Free Transcriptions in Resolve using OpenAI Whisper


Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Free Transcriptions in Resolve using OpenAI Whisper

PostSat Oct 01, 2022 12:58 pm

A few days ago OpenAI publicly released Whisper, their speech recognition model, which is unlike anything we've seen before, so we created a free tool for Resolve called StoryToolkitAI that transcribes Timelines into SRT subtitle files which can be imported back into Resolve.

Whisper recognizes speech from 97 languages and can translate them into English. So far, we've tried it on footage in English, Spanish, German, and Chinese, and it's really impressive.

StoryToolkitAI can be downloaded for free at the following link (it's written in Python, so some Python knowledge is required to install it): https://github.com/octimot/StoryToolkitAI

Once you install the necessary Python packages on your machine, you can simply go to the timeline in Resolve, press a button in the tool, and wait for the transcription to be processed. When done, you get all the phrases in a JSON file and the subtitles as SRT. Everything is done locally without the need for additional accounts or even an Internet connection, once you have all the packages installed.
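
To illustrate what the SRT output looks like, here is a minimal sketch that turns Whisper-style segments into SRT text. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` field; the function names are hypothetical and this is not necessarily how StoryToolkitAI does it internally.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from an iterable of {'start', 'end', 'text'} dicts."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle cues.
    return "\n".join(blocks)
```

Writing the returned string to a `.srt` file should give something Resolve can import as a subtitle track.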

Again, this is a free tool that we've been using in our editing process for almost a week now, but it's quite raw and may be buggy! In the future we may add other machine learning models like GPT-3 or CLIP to integrate other cool features like content summarization, automated markers etc.

If you're interested in contributing to this open source project, just get in touch! :D
Trying to keep it together at mots.us
Taming AI for filmmaking at StoryToolkit.ai

Andy Mees

  • Posts: 3208
  • Joined: Wed Aug 22, 2012 7:48 am

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostSat Oct 01, 2022 9:50 pm

Thanks for sharing, Octavian. It's much appreciated.
Was only reading about this just the other day, so it's great to see a working model for Resolve out in the wild so soon. Good work.
Cheers
Andy

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostMon Oct 03, 2022 6:11 am

Thanks Andy!

For now the tool only generates the transcript in a txt file as well as the subtitles in SRT, but the plan is to add more functions soon, like a text UI, speaker recognition and possibly text ideas split into timeline markers.

Again, it's quite amazing how good the transcriptions and the translations are, and that's why we use this exclusively now...

It would be nice to get more feedback from the community to understand what people need in their editing workflows which can be provided by AI / machine learning, and build this open source for everyone to use for free.

BaGRoS

  • Posts: 61
  • Joined: Fri Aug 27, 2021 8:58 am
  • Real Name: Miroslaw Bagrowski

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostMon Oct 03, 2022 6:24 am

I have not had time to test it, but at first glance, it could be very interesting.

Many thanks,
BaGRoS
BaGRoS
MSI GP66 Leopard / 11th Gen Intel(R) Core(TM) i7-11800H
64GB RAM / GTX3080 8GB
M.2 1TB+2TB Samsung 980PRO
SanDisk Extreme 1TB SSD
Seagate One Touch Hub, 4 TB
Speed Editor + DaVinci Resolve Studio 18.6.4

BaGRoS

  • Posts: 61
  • Joined: Fri Aug 27, 2021 8:58 am
  • Real Name: Miroslaw Bagrowski

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostMon Oct 03, 2022 8:03 pm

I tested with a recording in Polish, recorded outside while a fairly strong wind was blowing. Shame it doesn't use my graphics card, but the end result is still amazing. Two people were speaking, talking over each other. The programme made only one mistake, a spelling error. However, listening to the recording myself, I also had to guess, because the word was cut off, so it can be assumed that it is not an AI error but a video editing error. Next come tests with better recorded audio.

I'm waiting for more updates, better use of the graphics card, some kind of specific installer with automatic updating of all components.

MANY THANKS!

visualfeast

  • Posts: 584
  • Joined: Sat May 19, 2018 6:51 pm
  • Real Name: BEN JORDAN

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostMon Oct 03, 2022 10:41 pm

Windows 10 here. Do I have to be running Python 3.9? I have 3.10.7 and all dependencies checked out through requirements. But when I run the app.py, it says "Starting GUI" and nothing happens. Any ideas?
•Ryzen 5950x/64G/75TB RAID/3080ti/Intensity Pro 4K/2x ProArt PA278CGV/Dell U2415/Shogun 7/HPE LTO-6/Stream Deck XL
•ZBook 17 G3/64G/Quadro M5000M
•Inspiron 16+/32G/RTX3060
Resolve Studio v18.1.2 (x2)/Windows 10 Pro 22H2

iddos-l

  • Posts: 798
  • Joined: Sat Mar 30, 2019 7:55 pm
  • Real Name: iddo lahman

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 5:14 am

Sounds great!
I will check it very soon
Thanks for sharing.

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 5:15 am

BaGRoS wrote:I tested with a recording in Polish, recorded outside when a fairly strong wind was blowing. Shame it doesn't use my graphics card, but still the end result amazing.

I'll push an update in the next few days to have it select CUDA automatically, if available. We ran the code on an old 1070 GPU and the transcription time is around a quarter of the length of the audio. Just check back in the next few days!
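
For anyone curious, automatic device selection can be sketched roughly like this. It relies on PyTorch's `torch.cuda.is_available()`; the `pick_device` helper is a hypothetical name for illustration, and the import guard is only there so the snippet also runs on machines without PyTorch.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable CUDA GPU, otherwise 'cpu'."""
    try:
        import torch  # fall back to CPU if PyTorch is not installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using {device} for Torch / Whisper.")
# The chosen device could then be passed on when loading the model, e.g.
# whisper.load_model("medium", device=device)
```

Note that a CPU-only build of torch will report no CUDA support even when an NVIDIA card is present.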

What are your specs btw?

BaGRoS wrote:I'm waiting for more updates, better use of the graphics card, some kind of specific installer with automatic updating of all components.

Noted, thanks for the feedback!

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 5:21 am

visualfeast wrote:Windows 10 here. Do I have to be running Python 3.9? I have 3.10.7 and all dependencies checked out through requirements. But when I run the app.py, it says "Starting GUI" and nothing happens. Any ideas?

Yes, there's a weird bug with Python on Windows 10, but only on some machines (one of our Win machines included). Trying to find a fix...

BTW, although this will not solve the above issue, I recommend running 3.9.13 as this is our production environment. The app works fine in 3.10, but we might add some machine learning libraries that don't support 3.10 yet. Do you know how to install 3.9 and have it run in virtualenv on Windows? I recommend pylauncher and virtualenv.
Last edited by Octavian Mot on Tue Oct 04, 2022 9:40 am, edited 2 times in total.

BaGRoS

  • Posts: 61
  • Joined: Fri Aug 27, 2021 8:58 am
  • Real Name: Miroslaw Bagrowski

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 5:25 am

Laptop with Windows 11; Intel i7-11800H, GTX3080 8GB + 32GB shared memory
64GB RAM, two Samsung 980 PRO PCIe 4.0 M.2 1TB + 2TB
Nothing special, but still smooth.

Many thanks.



Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 10:08 am

BaGRoS wrote:I'm waiting for more updates, better use of the graphics card, some kind of specific installer with automatic updating of all components.

Just pushed an update for CUDA support, but make sure you install the CUDA version of torch as per the README. ;-)

Dani Urdiales

  • Posts: 18
  • Joined: Mon Jan 13, 2020 6:57 pm
  • Real Name: Dani Urdiales

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 11:45 am

Wow, amazing. I just tested it on a documentary that I'm editing now and it did an excellent job transcribing two old men talking, even when they were very quiet and both talking at the same time.

I could help with development when I have time in between jobs. For now I propose a simple addition: asking for the language. I hard-coded it because in the first test it interpreted the spoken language as Galician (galego) instead of Spanish.

Thank you very much for this impressive tool. It's going to help a lot in documentary production.


Cobrar980

  • Posts: 2
  • Joined: Mon Oct 03, 2022 8:17 pm
  • Real Name: Kevin Gerenda

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 2:02 pm

Thanks for sharing!
Lenovo Legion 5
Win 11
AMD Ryzen 7 5800H
16GB RAM
RTX 3050Ti
512GB & 1GB NVME

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 3:55 pm

Dani Urdiales wrote:Wow, amazing. I just tested it in a documentary that I'm editing now and it did an excellent job transcribing two old men talking, even very quiet and talking both at the same time.

I could help development when I have time in between jobs. By now I propose a simple add-on, asking for the language. I hard write it because in the first test it interpreted the spoken language as galego instead of spanish.

I'm glad that this is useful for you too! We're currently transcribing over 96 hours of footage for a doc in 5 languages and it's making very few errors!

The functions definitely need more user input; I'll try to make these available in an update very soon. If you want to contribute, just PM or get in touch directly on GitHub.

BaGRoS

  • Posts: 61
  • Joined: Fri Aug 27, 2021 8:58 am
  • Real Name: Miroslaw Bagrowski

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 04, 2022 8:50 pm

1m44sec movie -> CUDA - 17sec transcribe with medium model.

Next update:
1. Possibility to choose model size from GUI
2. Possibility to choose CUDA / CPU (for some reason torch doesn't want to allocate shared memory :( so I can only use the LARGE model on the CPU :( )
3. Perhaps configuring the software in such a way that it allows shared memory allocation at the expense of speed.


Overall, thanks for a super fun time :)

BaGRoS

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostWed Oct 05, 2022 6:24 am

BaGRoS wrote:1m44sec movie -> CUDA - 17sec transcribe with medium model.

That's significantly better than in our tests. Cool!

BaGRoS wrote:Next update:
1. Possibility to choose model size from GUI
2. Possibility to choose CUDA / CPU (for some reason torch doesn't want to allocate shared memory :( so I can only use the LARGE model on the CPU :( )
3. Perhaps configuring the software in such a way that it allows shared memory allocation at the expense of speed.

Noted, thanks!
For detailed tweaks, feel free to use the Issues tab in GitHub. Having a centralized place with info and suggestions would help other users too.

aaronvandomelen

  • Posts: 81
  • Joined: Sun Mar 01, 2020 3:11 am
  • Real Name: Aaron Van Domelen

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 06, 2022 7:34 pm

This is incredible, wow! I need to spend some time testing this out.

Do you have a method to import as ranged markers (kind of like Simon Says)?
current config = iMac 2019 (19,1), 3.6 GHz 8-Core Intel Core i9, 40 GB RAM ,Radeon Pro Vega 48 8 GB, Mac OSX 10.15.3

Constantin Gross

  • Posts: 48
  • Joined: Sun May 07, 2017 4:29 pm

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 06, 2022 8:15 pm

aaronvandomelen wrote:This is incredible, wow! I need to spend some time testing this out.

Do you have a method to import as ranged markers (kinda of like Simon Says?)


You can convert subtitle SRT files to ranged markers with this tool:

https://en.editingtools.io/subtitles/
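
For those who prefer to script it, the idea behind such a conversion can be sketched as follows: parse each SRT cue and turn its start and end times into a start frame plus a duration in frames. This is only an illustration under assumed names (`srt_to_markers`, a default of 24 fps) and not the method used by the linked tool.

```python
import re

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (a dot before the milliseconds is tolerated).
CUE_TIME = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def srt_to_markers(srt_text: str, fps: float = 24.0):
    """Return a list of (start_frame, duration_frames, text) tuples."""
    markers = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIME.search(line)
            if not match:
                continue
            h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
            start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
            end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
            start_frame = round(start * fps)
            # A ranged marker needs a duration of at least one frame.
            duration = max(1, round(end * fps) - start_frame)
            text = " ".join(lines[i + 1:]).strip()
            markers.append((start_frame, duration, text))
            break  # one timing line per cue
    return markers
```

The resulting tuples would still need to be offset by the timeline's start timecode before being used as marker positions.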

aaronvandomelen

  • Posts: 81
  • Joined: Sun Mar 01, 2020 3:11 am
  • Real Name: Aaron Van Domelen

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostFri Oct 07, 2022 3:52 am

Constantin Gross wrote:
aaronvandomelen wrote:This is incredible, wow! I need to spend some time testing this out.

Do you have a method to import as ranged markers (kinda of like Simon Says?)


You can convert subtitle SRT files to ranged markers with this tool:

https://en.editingtools.io/subtitles/


Whoa thank you!

AnthonyReno

  • Posts: 170
  • Joined: Mon May 23, 2022 9:58 am
  • Location: USA
  • Real Name: Mark Reno

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostFri Oct 07, 2022 5:29 am

Nicely done! Thanks for sharing!
DR & F Studio v18.1.1,Win11Pro, i9-13900K, 128GB RAM
GPUs:Intel UHD 770 & RTX 3090ti
OS:1.8TB SSD,P&C drives:2x2TB SSD
Speed Editor, Pen:Huion Inspiroy Dial 2 & XPPen Artist 13.3 Pro, Elecom HUGE Trackball

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostFri Oct 07, 2022 2:54 pm

Thanks for the great feedback! :D

Keep it coming, we're really interested in how people approach editing in Resolve and how to develop a tool that is actually helpful.

I'm pushing updates to GitHub almost daily right now, so make sure you're using the latest version. Things are still super raw and maybe buggy here and there, but we use the tool constantly in our editing room at this point. Having these locally generated, high-quality transcripts has really improved our workflow and is changing our editing pipeline.

I'll try to push these updates as soon as possible:
- in tool transcript search
- sync transcript with Resolve playhead
- transcript editing
- markers in Resolve via transcript selection
- automatic opening of transcripts on timeline change
- batch transcription of multiple Timelines
(now you can only render one at a time, although you can queue up transcriptions)

I wish the Resolve API weren't so limited in certain aspects, so that some of these ideas could be implemented faster without trying to hack into it too much, but I guess there's no other option but to work with what we have...

Dani Urdiales

  • Posts: 18
  • Joined: Mon Jan 13, 2020 6:57 pm
  • Real Name: Dani Urdiales

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Oct 11, 2022 11:08 am

Hi, I've been trying the C++ implementation from ggerganov, and it transcribes much faster on Macs with an Intel CPU and AMD GPU. It takes advantage of multiple cores.

https://github.com/ggerganov/whisper.cpp

Would it be possible to add an option to use this implementation in your story toolkit?



Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostWed Oct 12, 2022 6:40 am

Dani Urdiales wrote:Would it be possible to add an option to use this implementation in your story toolkit?

That's interesting!

I think it would be complicated for people without coding experience to install that, since it's my understanding that you have to compile the code from scratch on your machine.

But if you can think of a way to make that work, let's talk about it on the project GitHub: https://github.com/octimot/StoryToolkitAI

Also, if there are more people interested in this, let me know and maybe we can prioritize it. 8-)

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 13, 2022 4:24 pm

FWIW, I just made an update that is really speeding up our editing workflow.

Now, you can navigate the timeline using the transcription via shortcuts, select phrases and mark them directly in Resolve.

https://vimeo.com/759962195/dee07a067a

Andy Mees

  • Posts: 3208
  • Joined: Wed Aug 22, 2012 7:48 am

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 13, 2022 8:09 pm

That's very nice Octavian, looks like I might have to upgrade my work machine to v18.

cmactavish

  • Posts: 87
  • Joined: Fri Aug 28, 2020 1:38 pm
  • Real Name: Chay MacTavish

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 20, 2022 12:07 am

Have been playing around with this on an older Intel iMac, and despite a few UI bugs and a couple of setup hiccups, it is a game changer. It 100% nailed an English transcription as well as a Portuguese to English translation + transcription. It struggled with an Indonesian to English translation + transcription, but I think that was a tough ask and more a Whisper AI problem than a problem with this tool.

Will be trying it out again on some M1 machines and will report back.

Uli Plank

  • Posts: 21291
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 20, 2022 1:40 am

Were those problems with Indonesian massive or just a few? I may need that soon, so I’m curious.
No, an iGPU is not enough, and you can't use HEVC 10 bit 4:2:2 in the free version.

Studio 18.6.5, MacOS 13.6.5
MacBook M1 Pro, 16 GPU cores, 32 GB RAM and iPhone 15 Pro
Speed Editor, UltraStudio Monitor 3G, iMac 2017, eGPU

cmactavish

  • Posts: 87
  • Joined: Fri Aug 28, 2020 1:38 pm
  • Real Name: Chay MacTavish

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 20, 2022 2:22 am

We think the problem was perhaps dialect based. We can't be certain that the language being spoken was the same 'Indonesian' that WhisperAI thought it was. It was able to latch on to certain words, but then completely skipped other parts.

Uli Plank

  • Posts: 21291
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 20, 2022 2:45 am

Well, it could be mixed with one of the local languages. There is the nationwide "Bahasa Indonesia", which is well standardised, and lots of local languages, which I wouldn't expect to be supported in whole. And then there's Jakartian slang…

So, I'll see how well it works when our project starts.

cmactavish

  • Posts: 87
  • Joined: Fri Aug 28, 2020 1:38 pm
  • Real Name: Chay MacTavish

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 20, 2022 2:52 am

Uli Plank wrote:Well, it could be mixed with one of the local languages. There is the nationwide "Bahasa Indonesia", which is well standardised, and lots of local languages, which I wouldn't expect to be supported in whole. And then there's Jakartian slang…

So, I'll see how well it works when our project starts.


Yeah, this was our thinking. Perhaps a few common words, but otherwise a different language essentially.

ColinMcT

  • Posts: 54
  • Joined: Sat Jul 10, 2021 11:14 pm
  • Real Name: Colin McTaggart

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 20, 2022 6:21 am

I'd love to be able to help in testing. Any tutorials on how to set up and use?
This could be a great alternative to Script Sync used in Media Composer?
DaVinci Resolve Studio 17.4.3
2019 27" iMac 40gig ram
BM Speed Editor
BM Editor Keyboard
Ultrastudio mini 4K
BM Micro Panel

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 20, 2022 6:53 am

ColinMcT wrote:I'd love to be able to help in testing. Any tutorials on how to set up and use?
This could be a great alternative to Script Sync used in Media Composer?


Sure, there is a lot of info on the project GitHub page: https://github.com/octimot/StoryToolkitAI

For installation instructions, go here: https://github.com/octimot/StoryToolkit ... LLATION.md

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostMon Oct 24, 2022 6:32 am

We're currently working on an easier way to get the app running for folks who aren't much into Python/coding/etc.

On this page, we'll slowly add releases of the tool for different platforms: https://github.com/octimot/StoryToolkitAI/releases

Right now, Mac M1, Mac Intel and Windows (with CUDA GPUs) versions are available.
Last edited by Octavian Mot on Thu Nov 10, 2022 10:57 am, edited 2 times in total.

Sam Steti

  • Posts: 2470
  • Joined: Tue Jun 17, 2014 7:29 am
  • Location: France

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostMon Oct 24, 2022 7:04 am

Hello there,

I'll check that this week... I'll report here on whatever may be interesting...
Thanks in any case for making it available.
*MacMini M1 16 Go - Ext nvme SSDs on TB3 - 14 To HD in 2 x 4 disks USB3 towers
*Legacy MacPro 8core Xeons, 32 Go ram, 2 x gtx 980 ti, 3SSDs including RAID
*Resolve Studio everywhere, Fusion Studio too
*https://www.buymeacoffee.com/videorhin

ColinMcT

  • Posts: 54
  • Joined: Sat Jul 10, 2021 11:14 pm
  • Real Name: Colin McTaggart

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 27, 2022 2:40 am

Octavian Mot wrote:We're currently working on an easier way to get the app running for folks who aren't much into python/coding/etc..

On this page, we'll slowly add releases of the tool for different platforms: https://github.com/octimot/StoryToolkitAI/releases

Right now, a Mac M1 and Mac Intel version is available.

that will be awesome for us folks that have no knowledge of python etc

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Oct 27, 2022 4:11 am

ColinMcT wrote:that will be awesome for us folks that have no knowledge of python etc

Have you tried the standalone version for macOS? You only need to download it, then install Homebrew and ffmpeg with about 2-3 commands (if all works fine) and it should work.

Bookworm

  • Posts: 1
  • Joined: Sat Oct 29, 2022 11:04 pm
  • Real Name: Rhyz Presser

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostSat Oct 29, 2022 11:45 pm

I recently tried it myself and it seems pretty up there. There were a few errors, especially when I was the speaker. That was expected though, as I tend to stutter and stumble through my words, which would confuse the program. It ended up processing 220 minutes of audio in 26 minutes via CUDA on the medium model, which is impressive.
I am having a lot of trouble getting it to work with DaVinci Resolve Studio on Windows, but I think that is a user issue as I am very new to Python. As in, I only installed it for this.
One thing I would like to see in the program, and I'm not sure if it has it yet or not, is the ability to add parts of the transcript to DaVinci's title editor so we can place and modify them as we see fit. I suppose we can kind of do that already by matching the title's length up with the markers from the transcript.
Something else I would love to see, and again it might already be there, is for the transcript to designate who is saying what. I'm already trying that by adding (Person 1) and (Person 2) into the prompt and I'll update later if it works. I'm also testing out the large model to see how and if it differs, but that'll take significantly longer as I have to use my CPU because it needs 20 GB of memory and my GPU maxes out at 8.

Overall, I like this program and would probably love it once I can get it to talk with DaVinci.

EDIT: It now works with DaVinci thanks to Octavian's help. My speaker test didn't work though, and I still don't know how the large model differs as it took too long for me.

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Nov 03, 2022 8:43 am

Bookworm wrote:Overall, I like this program

Thanks for the feedback!

The speaker recognition is in the works and listed as a future feature on the project page: https://github.com/octimot/StoryToolkitAI#key-features

There's actually a lot of stuff that you can do now from the tool once you have the timelines transcribed, like searching for stuff and taking the Resolve playhead to the position of your text, marking different blocks of text in Resolve etc.

The idea is to optimize the editing process using AI, so imho there are some old editing habits that might not be necessary once everything is up and running - and that's why we're prioritizing the integration of AI based features (semantic search, summarization, hopefully visual recognition via CLIP etc.) over some stuff that folks are used to (like speaker recognition etc.)

Feel free to ask stuff or add feature requests on the project issues page: https://github.com/octimot/StoryToolkitAI/issues

This way we'll keep the Blackmagic forums noise free and keep all the ideas in one place.

rnbaker

  • Posts: 60
  • Joined: Thu Aug 22, 2019 10:40 am
  • Real Name: Ray Baker

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 4:31 pm

Octavian,
Thanks so much for developing and sharing this. I am not on GitHub yet but I will be creating an account soon. However, I have one suggestion for your installation instructions...

Change #7 (and at the very end) in the Windows installation to venv\Scripts\activate instead of activate.bat

What I found is...
"Just run activate, without an extension, so the right file will get used regardless of whether you're using cmd.exe or PowerShell."

Or, maybe it is my mistake that I was using PowerShell but that's what I always use. Thanks again for everything!


Edit: Never mind, it worked the first time and now isn't working so I guess the good ol' command prompt and venv\Scripts\activate.bat is the way to go! Sorry!
Last edited by rnbaker on Tue Nov 08, 2022 11:09 pm, edited 1 time in total.

Videoneth

  • Posts: 1615
  • Joined: Fri Nov 13, 2020 11:03 pm
  • Warnings: 1
  • Real Name: Maxwell Allington

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 4:38 pm

I think I'm gonna give it a try.
I already have Whisper installed in a virtual environment (on windows), everything is working.

Doing a : pip install git+https://github.com/openai/whisper.git - in my venv should be just fine right? (and install the package required if there is more than those already installed for Whisper)

Btw : https://github.com/jianfch/stable-ts << someone made a script for word level timestamps, works with Whisper.

Edit:
You should check out this repo : https://github.com/antiboredom/videogrep
Code: Select all
Videogrep is a command line tool that searches through dialog in video files and makes supercuts based on what it finds. It will recognize .srt or .vtt subtitle tracks, or transcriptions that can be generated with vosk, pocketsphinx, and other tools.

If your tool had a "side" script like this that could create "regular" EDL files too, that would be great, because this one creates a particular flavor of EDL file that doesn't open in Resolve. It can export XML files too, but when they are imported as a timeline, video and audio are not linked and it's a pain to use.

I say that because it's a super useful tool (at least for me). Finding and extracting only the part I need from long videos, based on what is said, is pretty cool.

Edit:
I'm not a dev, but would it be better to have the logs and conf files in the same directory as the GitHub install? That way the files would not be scattered across different folders (that's one thing I dislike about some applications, when they put things in many different folders: appdata, the "user" folder, roaming and local).

I installed in my Whisper venv :
Code: Select all
pkg_resources.VersionConflict: (typing-extensions 4.4.0 (c:\tools\ai\whisper\lib\site-packages), Requirement.parse('typing_extensions==4.3.0'))
< It looks like when Whisper is installed first alone, it installs a newer version of typing-extensions

Same for pkg_resources.VersionConflict: (tokenizers 0.13.1 (c:\tools\ai\whisper\lib\site-packages), Requirement.parse('tokenizers==0.12.1'))
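
A quick way to spot this kind of mismatch before launching is to compare the pinned versions against what is installed. The sketch below is only an illustration: `find_pin_conflicts` is a hypothetical helper, it handles plain `name==version` pins only, and it uses the standard library's `importlib.metadata` to look up installed versions.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a distribution name so '-', '_' and '.' compare as equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_pin_conflicts(pins, installed=None):
    """Return {name: (pinned, installed)} for every 'name==version' pin that differs.

    `installed` maps names to versions; when omitted, the current
    environment is queried through importlib.metadata.
    """
    known = None
    if installed is not None:
        known = {normalize(k): v for k, v in installed.items()}
    conflicts = {}
    for pin in pins:
        name, _, wanted = pin.partition("==")
        name, wanted = normalize(name.strip()), wanted.strip()
        if known is None:
            try:
                have = metadata.version(name)
            except metadata.PackageNotFoundError:
                have = None  # package is not installed at all
        else:
            have = known.get(name)
        if have != wanted:
            conflicts[name] = (wanted, have)
    return conflicts
```

Calling it with the two pins from the errors above, inside the Whisper venv, should list both packages together with the versions actually installed.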

Edit 2:
I installed the correct version of these two, the script starts (no popup of the app appears) .. I see
Code: Select all
INFO: Running StoryToolkitAI version 0.16.18
INFO: Using cuda for Torch / Whisper.
, then it stops after 10 secs

(For info, I have Python 3.10 installed)
Windows 10
18.6.6
nVidia 3090 - 537.42

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 6:34 pm

@Videoneth

I'll look into the stuff you mentioned, thanks! There's a lot of stuff that can be extended and we're constantly building it up also based on our editing needs.

For example, I just pushed a really cool feature which searches your transcripts by concepts and meaning using AI. It's raw, but already does a pretty good job and it runs locally on your machine.

Regarding installing on Windows, please see the detailed installation instructions here: https://github.com/octimot/StoryToolkit ... md#windows

Anything not per instructions is difficult to predict (for example Python 3.10 instead of 3.9) because we're using some experimental packages which are quite picky about their dependencies. But once you get it working, it really works. :D

If you still have problems installing, feel free to write here https://github.com/octimot/StoryToolkitAI/issues/9

Videoneth

  • Posts: 1615
  • Joined: Fri Nov 13, 2020 11:03 pm
  • Warnings: 1
  • Real Name: Maxwell Allington

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 7:01 pm

Yeah, I tried it this way first because I have so many "AI" Python apps in different environments, each with its own large PyTorch install taking up so much space.

Btw, can "app.py" take the "--model_dir" arg and pass it on to Whisper, so that it doesn't download the model again? (I have the medium and small ones in a specific folder; I like having these files in the same place :D)

I often have to search for specific things in 200-400 video files. Each has an .srt file associated with it, and when one doesn't, I use Whisper for it, but that's manual for each file. It would be great to have something ( ;) ) that scans a folder and transcribes the files that don't have any vtt or srt associated with them lol

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 7:21 pm

Videoneth wrote:Btw, can "app.py" take the "--model_dir" arg and pass it on to Whisper, so that it doesn't download the model again? (I have the medium and small ones in a specific folder; I like having these files in the same place :D)

No, it's using the default cache folder, since Whisper is not the only package in the tool that requires a model. You could probably create symlinks in the cache folder to the folders where you have your models, but that's a bit impractical... The reason these folders are standardized is that it makes it easy for people to simply run a script without dealing with downloads, verifications etc.
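
A rough Python sketch of the symlink idea, in case someone wants to try it. `link_models` is a hypothetical helper, and it assumes the models are stored as `*.pt` files and that the cache lives in a `whisper` folder under `~/.cache`; check both on your own machine first. On Windows, creating symlinks usually requires Developer Mode or an elevated prompt.

```python
import os
from pathlib import Path


def link_models(model_dir, cache_dir):
    """Symlink every *.pt file in model_dir into cache_dir, skipping existing names."""
    model_dir, cache_dir = Path(model_dir), Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for model in sorted(model_dir.glob("*.pt")):
        target = cache_dir / model.name
        if target.exists() or target.is_symlink():
            continue  # never overwrite something already in the cache
        os.symlink(model.resolve(), target)
        linked.append(target)
    return linked
```

Something like `link_models(r"D:\models\whisper", Path.home() / ".cache" / "whisper")` would then link the files, where both paths are only examples.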

Videoneth wrote:I often have to search for specific things in 200-400 video files. Each has an .srt file associated with it, and when one doesn't, I use Whisper for it, but that's manual for each file. It would be great to have something ( ;) ) that scans a folder and transcribes the files that don't have any vtt or srt associated with them lol

You can already batch transcribe multiple audio files by simply launching the tool without Resolve and selecting multiple audio files for transcription.
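
To find which files still need transcribing before selecting them, a small standalone Python sketch like this could help. It is not part of StoryToolkitAI; `files_missing_subtitles` and the extension list are assumptions for illustration, and the match is by file name only (`clip.mp4` counts as covered if `clip.srt` or `clip.vtt` sits next to it).

```python
from pathlib import Path

MEDIA_EXTS = {".mp4", ".mov", ".mkv", ".wav", ".mp3"}  # assumed list; extend as needed
SUBTITLE_EXTS = (".srt", ".vtt")


def files_missing_subtitles(folder):
    """Return media files under folder that have no sibling .srt/.vtt with the same stem."""
    missing = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in MEDIA_EXTS:
            continue
        if not any(path.with_suffix(ext).exists() for ext in SUBTITLE_EXTS):
            missing.append(path)
    return missing
```

The returned list can then be used to pick only the untranscribed files in the tool's file dialog.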

Searching a folder of transcripts is coming in a future update. Again, I'm talking about semantic search, not just finding words: the "advanced search" that we have now for individual transcripts.
Last edited by Octavian Mot on Tue Nov 08, 2022 7:24 pm, edited 1 time in total.

Octavian Mot

  • Posts: 286
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 7:30 pm

Videoneth wrote:Well, I've created a new virtual environment with 3.9.13, followed the installation process described on GitHub, and I get a module error: ModuleNotFoundError: No module named 'tkinter'

The official Python installer should have that module - see instructions on where to download it.

Let's move this conversation to the Windows installation issue page here: https://github.com/octimot/StoryToolkitAI/issues/9

This way we can keep the bm forum free of noise and help others install the tool if they run into similar problems!

Videoneth

  • Posts: 1615
  • Joined: Fri Nov 13, 2020 11:03 pm
  • Warnings: 1
  • Real Name: Maxwell Allington

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 7:31 pm

I deleted my post because it was a user error! I unchecked the "tk" box by mistake while installing 3.9; I went too fast during the installation lol
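
If anyone else wants to check this before launching the app, here is a tiny Python sketch. `tkinter_available` is just a hypothetical helper name; all it does is try the import inside the interpreter you run it with.

```python
def tkinter_available():
    """Return True if tkinter can be imported in this interpreter."""
    try:
        import tkinter  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True
```

Running `python -m tkinter` from the same environment is another quick check: it opens a small test window when Tk is installed.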

Btw, it works

Videoneth

  • Posts: 1615
  • Joined: Fri Nov 13, 2020 11:03 pm
  • Warnings: 1
  • Real Name: Maxwell Allington

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 7:43 pm

The main window needs a scrollbar :D
It works, I'm testing now.

I really think an option should be added to specify a folder where everything gets downloaded (and used). And the config files too. Just a suggestion.

I used another tool from a git repo where everything is contained in one place: all the different models for upscaling, face restoration, etc., from all the different sources the app uses.

Loading an srt file would be great too.

But good work man for this

PS:
Where is all-mpnet-base-v2 downloaded? I saw that it downloaded 458M when I tested the advanced search.
Edit: found it, in USERS/.cache/torch/sentence_transformers

Yep, it would definitely be a plus to have everything in one place. I think I'm gonna put a symlink there in the meantime :D

Videoneth

  • Posts: 1615
  • Joined: Fri Nov 13, 2020 11:03 pm
  • Warnings: 1
  • Real Name: Maxwell Allington

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostTue Nov 08, 2022 8:46 pm

When we do a search, having an index generated with the relevant sentences and timestamps would be great too. It would make it easier to navigate after a search.
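
Roughly what I mean, as a Python sketch. This uses a plain keyword match rather than the tool's semantic search, and it assumes each segment is a dict with a `start` time in seconds and a `text` field; `format_timestamp` and `build_index` are names made up for this example.

```python
def format_timestamp(seconds):
    """Format seconds as an SRT-style HH:MM:SS,mmm timestamp."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_index(segments, query):
    """Return 'timestamp  text' lines for segments whose text contains query (case-insensitive)."""
    needle = query.lower()
    return [
        f"{format_timestamp(seg['start'])}  {seg['text'].strip()}"
        for seg in segments
        if needle in seg["text"].lower()
    ]
```

Each returned line pairs the start time with the matching sentence, so the list could be printed or saved next to the transcript.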

aindless

  • Posts: 30
  • Joined: Mon Oct 17, 2022 12:15 am
  • Location: Pitești, Romania
  • Real Name: Daniel Petre

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostWed Nov 09, 2022 12:37 am

Thanks for the cool free software! I did not manage to integrate it with my DVR Studio (using Python 3.10 on Windows 11 Pro), but it works for transcribing the audio from my Romanian videos into YouTube subtitles!
Awesome!
Using DaVinci Resolve Studio on Windows 11 (12700K - 64 gb ram - rtx3080 - ssd)

Robert Arnold

  • Posts: 443
  • Joined: Tue Oct 30, 2012 11:53 pm

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Nov 10, 2022 1:30 am

Pretty amazing! I'm using the standalone on Mac. The first thing I was testing was Chinese. I'm not impressed with that English translation (I have existing, human-made subs to compare it to), but it did a perfect job on some English, French, Spanish, and even Nepali!

It *does* in fact hallucinate when it encounters silence for a long time. I accidentally gave it a timeline that was missing audio, and it came up with this:

-------------

I have attached my sunflower seeds to possession.
So what do you say?
I'm fine, thank you.
Sure, I have a sunflower seed.
Let's go out to see it.

--------------

Thanks for building this tool!
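
One possible way to post-filter output like the above, as a hedged Python sketch. Each segment returned by Whisper's `transcribe()` carries a `no_speech_prob` value; `drop_probable_silence` is a hypothetical helper that drops segments above a threshold. The 0.6 default is only a starting guess, and hallucinated text will not always score high enough to be caught.

```python
def drop_probable_silence(segments, max_no_speech_prob=0.6):
    """Keep only segments whose no_speech_prob is at or below the threshold."""
    return [
        seg for seg in segments
        if seg.get("no_speech_prob", 0.0) <= max_no_speech_prob
    ]
```

Segments without the field are kept, so the filter is safe to run on transcripts from other sources.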

Uli Plank

  • Posts: 21291
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Nov 10, 2022 4:08 am

OMG, AI poetry from silence!
No, an iGPU is not enough, and you can't use HEVC 10 bit 4:2:2 in the free version.

Studio 18.6.5, MacOS 13.6.5
MacBook M1 Pro, 16 GPU cores, 32 GB RAM and iPhone 15 Pro
Speed Editor, UltraStudio Monitor 3G, iMac 2017, eGPU