How to use this Python script? - Bro where is install.exe

sodomy lifestyle · Oct 3, 2025

Hi, I'm trying to collect Cyraxx soundbites, and figured the easiest way to comb through literal days of whining is to download the YouTube transcripts and search them for key phrases. I found this useful looking tool on GitHub:

youtube-bulk-transcript

What is the easiest way to use it? Do I need to resort to a command prompt and/or batch files, or is there a graphical interface that I could use? I'm an absolute beginner at coding.

Jeff Buckley · Oct 3, 2025

Read the README

Calvin Gabriel · Oct 3, 2025

Download the Python interpreter here. My guess is you'll want the 64-bit Windows installer. Install it.

Download the GitHub repository and open the folder where it's located. Open the command prompt (cmd.exe) at that location (I think on Windows you can do this with Shift+right-click in File Explorer; not sure).

Now you need to pull the project's dependencies.

Code:

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt

Now you should be able to use the commands from the README to download your transcripts.

EDIT: I tried the script and it's out of date. I managed to fix it up. If you send me the channel/playlist that you want scraped, I'll do it for you and send you the transcripts in a zip. Or you can try yourself with this edited version.

sodomy lifestyle · Oct 3, 2025

Calvin Gabriel said:
If you send me the channel/playlist that you want scraped, I'll do it for you and send you the transcripts in a zip.

If you're willing to do a favor for a computing novice, the channel I'm eyeing is www.youtube.com/@GoblinRecordsOfficial. With over a thousand clips, it'll probably keep me busy scrubbing for the rest of the year. If the script allows attaching the related video URL and title to each transcript, that would be helpful.

Calvin Gabriel · Oct 3, 2025

The script is quite dumb and doesn't account for rate limiting, so I'm IP banned atm lol. I might try again tomorrow.

Ed Special · Oct 3, 2025

Calvin Gabriel said:
The script is quite dumb and doesn't account for rate limiting, so I'm IP banned atm lol. I might try again tomorrow.

"Doesn't account for", or "Predates"? The script was last updated two years ago.

MongolianMongoose · Oct 3, 2025

Are you familiar with using yt-dlp to download YouTube videos? It can also download subtitles.
If you have yt-dlp installed properly, you can run this command in a terminal.

Code:

yt-dlp --skip-download --ignore-config --ignore-errors --write-subs --write-auto-subs https://www.youtube.com/channel/UCgpVO5oxAh7oMk3vynU-2Vg

BTW make sure you've updated yt-dlp recently if you try this.

sodomy lifestyle · Oct 3, 2025

MongolianMongoose said:
Are you familiar with using yt-dlp to download YouTube videos? It can also download subtitles.

Does it allow downloading all of a channel's subtitles in one go? I'm seeking a solution where I can avoid having to pick individual clips. There are several online services that offer something similar, but the best one I've found is limited to the transcripts of the hundred most recent clips.

agent_donkey · Oct 3, 2025

> resort to a command prompt

agent_donkey · Oct 3, 2025

Here.

P.S. It's a compressed tape archive so you don't feel like it was free.

Edit: I can't attach it apparently: https://files.catbox.moe/yg0vgw.gz
The original extension is ".tar.gz".

MongolianMongoose · Oct 3, 2025

sodomy lifestyle said:
Does it allow downloading all of a channel's subtitles in one go?

Yes. My example is preconfigured to do just that. Just put your own URL into the command.

sodomy lifestyle · Oct 3, 2025

agent_donkey said:
Here.

P.S. It's a compressed tape archive so you don't feel like it was free.

Edit: I can't attach it apparently: https://files.catbox.moe/yg0vgw.gz
The original extension is ".tar.gz".

Oh dear, this just leads to another issue. Now I would need a way to batch convert the .vtt format to plaintext.

:thinking:

[edit]

Essentially the ideal end result is an uninterrupted wall of text per clip, perfect for searching long phrases.

agent_donkey · Oct 3, 2025

sodomy lifestyle said:
Now I would need a way to batch convert the .vtt format to plaintext.

Try this.

Smaug's Smokey Hole 2 · Oct 4, 2025

sodomy lifestyle said:
Oh dear, this just leads to another issue. Now I would need a way to batch convert the .vtt format to plaintext.

[edit]

Essentially the ideal end result is an uninterrupted wall of text per clip, perfect for searching long phrases.

View attachment 7993816

Download Notepad++ and use regex to remove every line that contains a '>' (that gets rid of the timestamps and so on), then replace all line breaks with a space. I'm not sure how that format is structured, but just find a pattern in the lines you want removed, delete those lines, then replace the line breaks. Now you have a block of text.

Don't know regex? Copilot and other LLMs do! Specify that you want to use it in Notepad++. You can dump all the files into NPP and then run it on all open documents. Not the fanciest solution but...
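The same cleanup can also be sketched in Python, since it will already be installed for the original script. This is a minimal sketch, not a full WebVTT parser: the names `vtt_to_text` and `convert_folder` are made up for illustration, and it assumes the files look like typical YouTube subtitle downloads (a `WEBVTT` header, timing lines containing `-->`, optional inline tags, and caption lines that repeat from one cue to the next).

```python
import re
from pathlib import Path

# Inline cue tags such as <c>, </c> or <00:00:01.000> (assumed YouTube-style).
TAG_RE = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Collapse the contents of one .vtt file into a single line of text."""
    kept = []
    for raw in vtt.splitlines():
        line = TAG_RE.sub("", raw).strip()
        # Skip blank lines and timing lines ("00:00:00.000 --> 00:00:02.000").
        if not line or "-->" in line:
            continue
        # Skip header/metadata lines. Assumption: real captions do not
        # start with these words.
        if line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue
        # Auto-generated captions tend to repeat the previous line.
        if kept and kept[-1] == line:
            continue
        kept.append(line)
    # Normalize any leftover runs of whitespace.
    return " ".join(" ".join(kept).split())


def convert_folder(folder: str) -> int:
    """Write a .txt next to every .vtt in the folder; return the file count."""
    count = 0
    for vtt_path in Path(folder).glob("*.vtt"):
        text = vtt_to_text(vtt_path.read_text(encoding="utf-8", errors="replace"))
        vtt_path.with_suffix(".txt").write_text(text + "\n", encoding="utf-8")
        count += 1
    return count
```

Calling `convert_folder("subs")` on a folder of downloaded subtitles (the folder name is just an example) should leave one wall of text per clip, which is what the long-phrase search needs.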

Aidan · Oct 4, 2025

Calvin Gabriel said:
The script is quite dumb and doesn't account for rate limiting, so I'm IP banned atm lol. I might try again tomorrow.

For future reference, use yt-dlp as @MongolianMongoose pointed out and add a wait between each pull to help avoid being rate limited. yt-dlp also makes it easy to pick up where you left off, and combined with a VPN you can get around a lot of their countermeasures.

sodomy lifestyle said:
Oh dear, this just leads to another issue. Now I would need a way to batch convert the .vtt format to plaintext.

[edit]

Essentially the ideal end result is an uninterrupted wall of text per clip, perfect for searching long phrases.

View attachment 7993816

ffmpeg can do this, but he threw it all in one big file, which means you'd need to break it back up.

You can use this to download them yourself. Add --sleep-subtitles X, where X is a number of seconds, if you want to sleep in between.

Code:

yt-dlp --write-auto-subs --write-subs --convert-subs "srt" --download-archive archive.txt --skip-download --no-post-overwrites  -o "%(uploader)s/%(title)s [%(id)s].%(ext)s"  -a video_ids.txt --exec "before_dl:echo REMOVED ID: \"%(id)s\"; sed -i '/%(id)s/d' video_ids.txt"

This won't actually make the archive.txt file, but it can't hurt to keep the option there for reference. The attached video_ids.txt file lists the ID of each video on the channel, which makes it easier to pause and resume since there are so many videos. The command reads the list of videos from that file and uses sed to remove each ID as it goes.
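sed is not available in a stock Windows command prompt, so the "remove the ID as it goes" step may need a substitute there. Here is a rough Python sketch of just that step. The function name `remove_id` is made up for illustration, and it assumes video_ids.txt holds one ID per line. Unlike the sed pattern, it only drops lines that match the ID exactly.

```python
from pathlib import Path


def remove_id(list_path: str, video_id: str) -> int:
    """Drop every line equal to video_id; return how many lines were removed."""
    path = Path(list_path)
    lines = path.read_text(encoding="utf-8").splitlines()
    # Keep every line that is not exactly the finished ID.
    kept = [ln for ln in lines if ln.strip() != video_id]
    path.write_text("".join(ln + "\n" for ln in kept), encoding="utf-8")
    return len(lines) - len(kept)
```

How this gets wired into the download (a small wrapper script, or run by hand after a batch finishes) is left open here.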

You can say "Well I don't want to use the command line" but you're making it harder on yourself and that's on you. No one wrote a program to do this very specific task for you.

Once you have all the subtitles downloaded, you can parse them again using sed or whatever your favorite method is. I think srt is the simplest format to edit, since each non-caption line is easy to parse out, and removing the newlines afterwards is trivial.
Ask for help as you need it.
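As one possible "favorite method", here is a short Python sketch of the srt parsing described above. The names `srt_to_text` and `contains_phrase` are illustrative only, and it assumes the usual srt layout: a numeric index line, a timing line containing `-->`, the caption text, then a blank line.

```python
import re


def srt_to_text(srt: str) -> str:
    """Strip index and timing lines from srt content and join the captions."""
    pieces = []
    normalized = srt.replace("\r\n", "\n").strip()
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if lines and lines[0].isdigit():   # cue index, e.g. "12"
            lines = lines[1:]
        if lines and "-->" in lines[0]:    # timing line
            lines = lines[1:]
        for ln in lines:
            # Skip a caption line that just repeats the previous one.
            if not pieces or pieces[-1] != ln:
                pieces.append(ln)
    return " ".join(pieces)


def contains_phrase(text: str, phrase: str) -> bool:
    """Case-insensitive search that ignores differences in spacing."""
    def squash(s: str) -> str:
        return " ".join(s.lower().split())
    return squash(phrase) in squash(text)
```

Because the captions end up on one line, a phrase that was split across two cues in the srt file can still be found with `contains_phrase`.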

Aidan · Oct 5, 2025

I went through and did what I think you're after for all of the transcriptions I could get. Many of the videos require a signed-in account. If you use the command above and add --cookies-from-browser chrome (for example), you can get some of those from the attached txt file. If you get those and want them parsed out into blocks of text, share them with me and I'll do the same thing to them that I did to the others.

There are some more that I would need to re-verify that are also not in the 7zip file but I don't feel like it right now.
