Free Transcriptions in Resolve using OpenAI Whisper


Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Thu Dec 15, 2022 12:07 pm

I would like to check it out but I can't open the app.

I get the "StoryToolkitAI.0.16.18.M1.app" is damaged error.

So I've run:

Last login: Thu Dec 15 12:56:59 on ttys002
dav@Davs-MacBook-Pro /Applications % xattr -d com.apple.quarantine ./StoryToolkitAI.0.16.18.M1.app

xattr: No such file: ./StoryToolkitAI.0.16.18.M1.app
dav@Davs-MacBook-Pro /Applications %


Any ideas how I can open it?
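
The "No such file" reply from xattr suggests that the relative path did not point at an app bundle in the current folder, rather than a problem with the quarantine attribute itself. Below is a minimal Python sketch of a pre-check, assuming the bundle name from the post. The helper name quarantine_command is hypothetical; it only builds the command and does not run it.

```python
from pathlib import Path
from typing import List, Optional


def quarantine_command(app_path: str) -> Optional[List[str]]:
    """Return the xattr command for app_path, or None if the bundle is missing.

    Hypothetical helper for illustration: it builds the command only.
    """
    bundle = Path(app_path).expanduser()
    if not bundle.exists():
        # This is the situation in which xattr prints "No such file".
        return None
    # Same command as in the post, but with an absolute path.
    return ["xattr", "-d", "com.apple.quarantine", str(bundle.resolve())]


if __name__ == "__main__":
    cmd = quarantine_command("/Applications/StoryToolkitAI.0.16.18.M1.app")
    if cmd is None:
        print("App bundle not found; check the folder and the exact file name.")
    else:
        print("Run:", " ".join(cmd))
```

If the helper reports that the bundle is missing, listing the folder (for example with ls) should show the name the app was actually unzipped under.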
DaVinci Resolve Studio 18.1
MacBook Pro M1 (2020) | 16GB ram | 1TB SSD
macOS Ventura

Nate Porter

  • Posts: 204
  • Joined: Wed May 01, 2013 8:22 am

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Thu Dec 15, 2022 1:30 pm

This is an amazing project. I love everything that works so far and can't wait for further development. I'd love for this type of thing to exist natively in Resolve to just open the door for a lot of people that wouldn't be able to otherwise have access to it.

Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Thu Dec 15, 2022 7:04 pm

Mario69Rossi wrote:Any ideas how I can open it?

Sorted!

Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Fri Dec 16, 2022 11:23 am

I ran a few tests and the results are no better than YouTube's auto-generated subtitles. It also can't recognize well-known brands like Lidl.

Timing is all over the place.

Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Fri Dec 16, 2022 12:34 pm

Any idea on how to improve the quality of the output?

At least I would like the timing to be correct. I think the fact that the AI model analyses only the audio is a problem, as it doesn't split the captions properly based on the different clips. In my experience it generates very long captions that should instead be split across several clips.

Octavian Mot

  • Posts: 255
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Fri Dec 16, 2022 9:33 pm

Mario69Rossi wrote:Any idea on how to improve the quality of the output?

There are many optimizations to be made when you feed your footage for transcriptions, from audio quality all the way to the initial prompt you give to the AI. Just check out the project page on github as a lot of them are detailed there (but also older posts in this thread). Feel free to open up an issue on github for more detailed explanations regarding your particular setup.

But, generally speaking, if you follow the instructions and feed it the proper audio format, the tool should work well out of the box as many people have reported.

More updates and fixes are coming soon, but currently our entire team is stuck in production mode, so we're not in the editing room to develop and test...
Trying to keep it together at mots.us
Taming AI for filmmaking at StoryToolkit.ai

trinderfilms

  • Posts: 4
  • Joined: Fri Dec 16, 2022 11:08 am
  • Real Name: Aaron Trinder

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Fri Dec 16, 2022 10:02 pm

Mario69Rossi wrote:I would like to check it out but I can't open the app.

I get the "StoryToolkitAI.0.16.18.M1.app" is damaged error.

So I've run:

Last login: Thu Dec 15 12:56:59 on ttys002
dav@Davs-MacBook-Pro /Applications % xattr -d com.apple.quarantine ./StoryToolkitAI.0.16.18.M1.app

xattr: No such file: ./StoryToolkitAI.0.16.18.M1.app
dav@Davs-MacBook-Pro /Applications %


Any ideas how I can open it?


Same with me... back to Descript!

Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Sat Dec 17, 2022 4:55 pm

trinderfilms wrote:Same with me... back to Descript!


For me it worked at some point. I deleted the app again, downloaded and unzipped it, and copied it into the Applications folder, but also left a copy in the Downloads folder. Not sure if it's a coincidence or not, but it worked.

CougerJoe

  • Posts: 149
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Sun Dec 18, 2022 12:53 am

I did some testing a couple of weeks ago with various models on a 3.5-hour podcast. They give up transcribing after a period of time: some models stopped after 1 hour, another after 2.5 hours, and still another processed the entire file. Unfortunately I've lost my notes about the models. I think the only model that worked for the entire 3.5-hour VOD was called english small.

Bruce L

  • Posts: 2
  • Joined: Mon Dec 12, 2022 4:06 pm
  • Real Name: Bruce Lichtenstein

Unable to find module DaVinciResolveScript from PYTHONPATH

Posted: Mon Dec 19, 2022 4:18 pm

I have seen this error addressed before but I wonder if the version of Python matters?
I can not find 3.9 anymore so I used the latest 3.9.13. Is that OK?

I am on Windows 11, even though the log files refer to it as Windows 10.

Platform: Windows 10
Platform version: 10.0.22621
OS: 10 10.0.22621 SP0 Multiprocessor Free ('', '', '')
running Python 3.9.13.final.0
-------------- (app.py:140)
2022-12-19 11:04:38,990 - StAI - DEBUG: All package requirements met. (app.py:160)
2022-12-19 11:04:38,990 - StAI - INFO: Running StoryToolkitAI version 0.17.5 (standalone) (app.py:8006)
2022-12-19 11:04:39,210 - StAI - DEBUG: Looking for ffmpeg in env variable. (app.py:8437)
2022-12-19 11:04:39,210 - StAI - DEBUG: FFMPEG_BINARY env variable is empty. Looking for ffmpeg in PATH. (app.py:8444)
2022-12-19 11:04:39,212 - StAI - DEBUG: Checking ffmpeg binary: C:\ProgramData\chocolatey\bin\ffmpeg.EXE (app.py:8456)
2022-12-19 11:04:39,415 - StAI - DEBUG: FFMPEG exit code: 1 (app.py:8461)
2022-12-19 11:04:39,416 - StAI - INFO: FFMPEG found at C:\ProgramData\chocolatey\bin\ffmpeg.EXE (app.py:8464)
2022-12-19 11:04:39,436 - StAI - INFO: Using cuda for Torch / Whisper. (app.py:5947)
2022-12-19 11:04:39,437 - StAI - DEBUG: MotsResolve module initialized. (mots_resolve.py:42)
2022-12-19 11:04:39,437 - StAI - DEBUG: Found DaVinci Resolve at the default location: C:\Program Files\Blackmagic Design\DaVinci Resolve\ (mots_resolve.py:142)
2022-12-19 11:04:39,437 - StAI - DEBUG: Unable to find module DaVinciResolveScript from PYTHONPATH - trying default locations next (mots_resolve.py:154)
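
Note that the last log line is tagged DEBUG and ends with "trying default locations next", so on its own it may not indicate a failure. For anyone who wants to check the scripting module by hand, here is a minimal Python sketch. The default folders are the ones I believe Resolve's scripting README lists, so treat them as assumptions and verify them on your own install. The function name resolve_modules_dir is hypothetical, and this is not how StoryToolkitAI itself performs the lookup.

```python
import os
import sys


def resolve_modules_dir(platform: str, env: dict) -> str:
    """Guess the folder that should contain DaVinciResolveScript.py.

    Default paths are assumptions based on Resolve's scripting README.
    """
    is_windows = platform.startswith("win")
    api_dir = env.get("RESOLVE_SCRIPT_API")
    if not api_dir:
        if is_windows:
            program_data = env.get("PROGRAMDATA", r"C:\ProgramData")
            api_dir = (program_data
                       + r"\Blackmagic Design\DaVinci Resolve\Support\Developer\Scripting")
        elif platform == "darwin":
            api_dir = ("/Library/Application Support/Blackmagic Design/"
                       "DaVinci Resolve/Developer/Scripting")
        else:
            api_dir = "/opt/resolve/Developer/Scripting"
    separator = "\\" if is_windows else "/"
    return api_dir.rstrip("\\/") + separator + "Modules"


if __name__ == "__main__":
    modules_dir = resolve_modules_dir(sys.platform, dict(os.environ))
    if os.path.isdir(modules_dir) and modules_dir not in sys.path:
        sys.path.append(modules_dir)
    try:
        import DaVinciResolveScript  # shipped with Resolve, not installable via pip
        print("Found DaVinciResolveScript in", modules_dir)
    except ImportError:
        print("DaVinciResolveScript not importable; looked in", modules_dir)
```

Adding that Modules folder to the PYTHONPATH environment variable should have the same effect as the sys.path line above.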

Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Tue Dec 20, 2022 10:22 am

CougerJoe wrote:I did some testing a couple of weeks ago with various models on a 3.5-hour podcast. They give up transcribing after a period of time: some models stopped after 1 hour, another after 2.5 hours, and still another processed the entire file. Unfortunately I've lost my notes about the models. I think the only model that worked for the entire 3.5-hour VOD was called english small.


Sometimes I run it and basically get no output, just a generic "- How are you? - I'm fine thanks." So I run it again and it works the second time around.

pantau000

  • Posts: 40
  • Joined: Wed Dec 21, 2022 5:42 pm
  • Real Name: Peter Antoni

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Dec 21, 2022 5:45 pm

thanks for this wonderful tool and the very precise instructions.

however, marker copying from/to timeline doesn't work for me, nothing happens...

sturmen

  • Posts: 67
  • Joined: Mon Jul 29, 2019 3:53 pm
  • Location: New York, NY
  • Real Name: Nicholas Tinsley

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Thu Dec 22, 2022 4:53 pm

This tool is fantastic, and I could not be more excited to see where future development takes us. One quick question: Can I sponsor an application icon? It appears to have the default Xcode template icon on macOS.
MacBook Pro, 16", M1 Max 64GB

mW_Philipp

  • Posts: 1
  • Joined: Fri Dec 23, 2022 12:02 pm
  • Real Name: Philipp Noskin

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Fri Dec 23, 2022 12:18 pm

Hey

Would it be possible to add a feature that splits the transcribed sections to match the clips I have in the timeline?
That would be incredibly useful for editing short, heavily cut videos with fast-changing subtitles.

EyeFly

  • Posts: 4
  • Joined: Wed Jan 09, 2019 4:22 pm
  • Real Name: Oliver Dadswell

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Sat Dec 24, 2022 1:16 am

Wonderful resource! Thanks for sharing

pantau000

  • Posts: 40
  • Joined: Wed Dec 21, 2022 5:42 pm
  • Real Name: Peter Antoni

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Sat Dec 24, 2022 1:12 pm

pantau000 wrote:marker copying from/to timeline doesn't work for me, nothing happens...


Anyone else experiencing this problem?

rnbaker

  • Posts: 58
  • Joined: Thu Aug 22, 2019 10:40 am
  • Real Name: Ray Baker

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Sun Dec 25, 2022 3:28 am

Mario69Rossi wrote:Any idea on how to improve the quality of the output?

At least I would like the timing to be correct. I think the fact that the AI model analyses only the audio is a problem, as it doesn't split the captions properly based on the different clips. In my experience it generates very long captions that should instead be split across several clips.


I did an experiment a few weeks back on a Windows machine. With the "Auto" setting (which uses both the CPU and GPU), the long captions don't happen, and both punctuation and capitalization are excellent. The one thing that is far from perfect is the timing: it was fine during parts of a given project but off most of the time, and when it was off, the error was always in the range of +/- 0.25 to 2.5 seconds, as far as I could tell.

However, if you choose GPU only ("CUDA"), the timing is 100% accurate, or very close to it, but the tool creates very long captions and makes many more errors in punctuation (including where sentences start and end) and capitalization.

At least, that is what happened for me with two longer projects (both 4+ hours each) that I experimented with. There was also a slight difference between using the large model and medium English model but not much of a difference if I remember correctly (and I intended to post about this much sooner when the details were more fresh in my mind but unfortunately I got busy with other things).

@Octavian Mot

Leslie Wand

  • Posts: 661
  • Joined: Wed Jul 24, 2013 5:56 am
  • Location: rural nsw, australia

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Sun Dec 25, 2022 6:27 am

whatever, octavian - a terrific effort and amazing integration. a very heartfelt thanks.

stay well, and all the very best for the new year.
www.lesliewand.com.au
amd5 5800x / 32gb ram / rtx 3050 8gb / win 10 pro

pantau000

  • Posts: 40
  • Joined: Wed Dec 21, 2022 5:42 pm
  • Real Name: Peter Antoni

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Sun Dec 25, 2022 2:29 pm

pantau000 wrote:marker copying from/to timeline doesn't work for me, nothing happens...


Well, actually something does happen... Clicking "Copy Timeline Markers to Same Clip", the markers on the timeline get copied from the timeline in the viewer to the timeline itself, but not onto the clip on the timeline.

I don't really understand what this is good for, and what's the difference between the timeline markers on the active timeline, and the markers of the same timeline in the media pool.

Would be great if somebody could explain this to me.

Octavian Mot

  • Posts: 255
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Mon Dec 26, 2022 7:37 am

Thanks again for the great feedback and sorry for the lack of response. We just finished a production and slowly coming back to the editing room in the first week of January. More updates on the tool are coming soon!

I don't really understand what this is good for, and what's the difference between the timeline markers on the active timeline, and the markers of the same timeline in the media pool.

The purpose of the copy markers functions is to copy markers between the timeline and its corresponding source clip in the media pool; see more info here: https://github.com/octimot/StoryToolkit ... -same-clip

We use 3 and 4 point editing techniques and finding markers on the source clip is really helpful. Also, if you copy markers from timeline to source, you're able to search for them in the media pool of Resolve. In other words, we use StoryToolkit to mark timelines via content, then copy markers to source clip.

In an update coming very soon, we will also enable semantic search on Resolve markers.

@rnbaker
We'll look into it asap. The new Whisper large v2 model seems to also be better. I think you just have to delete the old model from your cache and the tool should download the new one, but we'll check that and come back with instructions soon.

As I was saying earlier, transcriptions are means to an end - the end being really good footage search and a real AI assistant editor that knows exactly what is where to help folks edit faster - frame precision is not really important for that. But I realize there's a lot of use from transcriptions and precise timed subtitles for many people and will definitely improve that with a few upcoming updates.
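
On deleting the old model from the cache, as mentioned above: before removing anything, it can help to see which checkpoints are actually cached. The Python sketch below only lists files and deletes nothing. It assumes the default download folder of the openai-whisper package (a "whisper" folder under the user cache directory); StoryToolkitAI may store its models elsewhere, so the location is an assumption. Both function names are hypothetical.

```python
import os
from pathlib import Path
from typing import List, Optional


def default_whisper_cache(env: Optional[dict] = None) -> Path:
    """Assumed default download folder of the openai-whisper package."""
    env = dict(os.environ) if env is None else env
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "whisper"


def cached_models(cache_dir: Path) -> List[Path]:
    """List cached model checkpoints (*.pt), sorted by name. Deletes nothing."""
    if not cache_dir.is_dir():
        return []
    return sorted(cache_dir.glob("*.pt"))


if __name__ == "__main__":
    folder = default_whisper_cache()
    print("Looking in:", folder)
    for checkpoint in cached_models(folder):
        size_mb = checkpoint.stat().st_size / (1024 * 1024)
        print(f"  {checkpoint.name}  ({size_mb:.0f} MB)")
```

Once the old large checkpoint has been identified, it can be removed by hand so that the next run downloads a fresh copy.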

pantau000

  • Posts: 40
  • Joined: Wed Dec 21, 2022 5:42 pm
  • Real Name: Peter Antoni

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Mon Dec 26, 2022 6:03 pm

Octavian Mot wrote:The purpose of the copy markers functions is to copy markers between the timeline and its corresponding source clip in the media pool


The possibility to copy timeline markers to the source clip is exactly what I am looking for in my workflow. However, either I get something completely wrong, or it doesn't seem to work this way. This is a screenshot of my timeline and media bin before hitting the timeline-to-clip button:

[Attachment: resolve1.jpg]


And this is how it looks after:

[Attachment: resolve2.jpg]


So the result is that the markers are copied to the timeline itself, and not the source clip, right?

rnbaker

  • Posts: 58
  • Joined: Thu Aug 22, 2019 10:40 am
  • Real Name: Ray Baker

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Mon Dec 26, 2022 6:12 pm

Octavian Mot wrote:
@rnbaker
We'll look into it asap. The new Whisper large v2 model seems to also be better. I think you just have to delete the old model from your cache and the tool should download the new one, but we'll check that and come back with instructions soon.

As I was saying earlier, transcriptions are means to an end - the end being really good footage search and a real AI assistant editor that knows exactly what is where to help folks edit faster - frame precision is not really important for that. But I realize there's a lot of use from transcriptions and precise timed subtitles for many people and will definitely improve that with a few upcoming updates.


Yep, no worries, I was just reporting my findings that "Auto" is better for punctuation, capitalization, sentence identification, and normal subtitle length, while "CUDA" has much better timing. I was mainly just relaying my findings to Mario and decided to let you know, too. I totally understand and if it stays how it is, it's ok by me. Thanks again and happy holidays!

Octavian Mot

  • Posts: 255
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Tue Dec 27, 2022 10:05 am

pantau000 wrote:So the result is that the markers are copied to the timeline itself, and not the source clip, right?


Right! Now I realize what you mean! Yes, that is the wanted behaviour.

For us, "a source clip" is a clip that is opened into the "source viewer", therefore, if the timeline is opened in the source viewer it technically becomes a source clip (or source timeline, since the type is a timeline).

In other words, if the markers are visible in the source viewer or in the media panel, we're dealing with "source clip markers", and that "clip" is the one that has the same name as the timeline that is active in the timeline tab. If the markers are visible in the timeline tab, we're dealing with "timeline markers".

Really going into definitions rabbit hole now, but I hope it makes sense - we're basically using the Resolve manual to classify these things.

What you're probably looking for is a "copy markers from timeline to timeline clips" function! And then a function that "copies timeline clips markers to source clips"! :D
Feel free to open an issue on the GitHub page for this feature and we might squeeze it in some day.

Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Tue Dec 27, 2022 11:32 am

Octavian Mot wrote:As I was saying earlier, transcriptions are means to an end - the end being really good footage search and a real AI assistant editor that knows exactly what is where to help folks edit faster - frame precision is not really important for that. But I realize there's a lot of use from transcriptions and precise timed subtitles for many people and will definitely improve that with a few upcoming updates.

That's what I use it for: simply to have subtitles. At the moment I find that the individual captions are way too long and the timing is really off, especially when there are gaps between talking points, for example during a hyperlapse.

Thanks for your work. Perhaps you could open up donations; if all the users sent something, it could help with the time you spend on this project.

pantau000

  • Posts: 40
  • Joined: Wed Dec 21, 2022 5:42 pm
  • Real Name: Peter Antoni

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Tue Dec 27, 2022 5:38 pm

Octavian Mot wrote:Right! Now I realize what you mean! Yes, that is the wanted behaviour. What you're probably looking for is a "copy markers from timeline to timeline clips" function!


Thanks very much, now I understand...

Exactly, what I'm looking for is the possibility to "copy timeline clips markers to source clips". Let me explain why: working with foreign language documentary material, I need a workflow that links subtitles to the original source clips, not to the finished timeline, as the editor(s) need translated subtitles to understand what is happening on the clips to be able to edit them.

As far as I can see, the only way to achieve this is to work with duration markers, which can either be displayed during playback or somehow burned into proxy source clips.

Octavian Mot wrote:Feel free to open an issue on the GitHub page for this feature and we might squeeze it in some day.


Good advice, I'll do so.

CougerJoe

  • Posts: 149
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Tue Dec 27, 2022 11:52 pm

Mario69Rossi wrote:That's what I use it for: simply to have subtitles. At the moment I find that the individual captions are way too long and the timing is really off, especially when there are gaps between talking points, for example during a hyperlapse.


Yes, that's my problem too, a person may speak for 20-30 seconds, and the subtitle is for everything they said for that duration in a single instance subtitle. If that is intentional, I feel it would be better to break it up sentence by sentence.

I also saw something interesting where the person repeated themselves, something like 'So afterwards we'll go to dinner, afterwards we'll go to dinner' , and the subs don't repeat themselves like the person did. I wonder if that's some sort of error control, or for the sake of the subs to reduce clutter it's an active decision to remove a person repeating themselves. I don't think it should be making that decision if true, just transcribe everything factually.
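
To illustrate the sentence-by-sentence idea, here is a minimal Python sketch that post-processes one long caption. It is not what StoryToolkitAI or Whisper does internally. The function name split_caption is hypothetical, and the new timings are only estimates, made by sharing the original duration in proportion to sentence length.

```python
import re
from typing import List, Tuple

Caption = Tuple[float, float, str]  # (start seconds, end seconds, text)


def split_caption(start: float, end: float, text: str) -> List[Caption]:
    """Split one long caption into sentence-level captions.

    Boundaries are approximate: each sentence gets a share of the
    original duration proportional to its character count.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [part.strip() for part in parts if part.strip()]
    if len(sentences) <= 1:
        return [(start, end, text.strip())]

    total_chars = sum(len(sentence) for sentence in sentences)
    duration = end - start
    captions: List[Caption] = []
    cursor = start
    for index, sentence in enumerate(sentences):
        if index == len(sentences) - 1:
            segment_end = end  # the last caption ends exactly where the original did
        else:
            segment_end = cursor + duration * len(sentence) / total_chars
        captions.append((round(cursor, 3), round(segment_end, 3), sentence))
        cursor = segment_end
    return captions


if __name__ == "__main__":
    for caption in split_caption(0.0, 10.0, "Hello there. How are you?"):
        print(caption)
```

Because the split relies on punctuation, it only helps when the transcript already marks where sentences end.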

Mario69Rossi

  • Posts: 88
  • Joined: Fri Jan 29, 2021 4:11 am
  • Real Name: Mario Rossi

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Dec 28, 2022 11:06 am

CougerJoe wrote:Yes, that's my problem too, a person may speak for 20-30 seconds, and the subtitle is for everything they said for that duration in a single instance subtitle. If that is intentional, I feel it would be better to break it up sentence by sentence.


Yeah, some captions are very long, but the timing also gets messed up. For example, the video is playing clip A and the caption for clip B is already showing, and when there is a long gap between talking points, one giant caption is created. At the moment, cleaning up the subtitles generated with this tool takes me more or less the same time as cleaning up the ones generated by YouTube, but there are some advantages with this tool, like not relying on YouTube connectivity.

CougerJoe wrote:I also saw something interesting where the person repeated themselves, something like 'So afterwards we'll go to dinner, afterwards we'll go to dinner' , and the subs don't repeat themselves like the person did. I wonder if that's some sort of error control, or for the sake of the subs to reduce clutter it's an active decision to remove a person repeating themselves. I don't think it should be making that decision if true, just transcribe everything factually.


I also noticed this and I like it because it cleans up the subtitles for me and removes unnecessary wording. For example, I spend a long time on YouTube auto-generated subtitles removing hundreds of "hu"s. I really like this behavior. I also like that it can recognize different languages. For example, my videos are in English, but when travelling I speak a few sentences to local people in their language, say Spanish, and this tool transcribes them in English. On the other hand, I had a video all in English that started with me saying "Buenos Dias", as I was in Spain, and it transcribed the whole 20-minute video in Spanish even though all the talking was in English except the first two words. I had to transcribe again, and this time I selected a .en model instead of the generic one.
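
For anyone calling Whisper directly, the workaround above (picking a .en model) can be sketched in Python as follows. This assumes the openai-whisper package is installed and is not how StoryToolkitAI exposes these settings. The helper names pick_model_name and transcribe are hypothetical. The language and initial_prompt arguments are options of Whisper's transcribe call.

```python
from typing import Optional

# Sizes for which openai-whisper publishes English-only ".en" checkpoints.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def pick_model_name(size: str, english_only: bool) -> str:
    """Return e.g. "small.en" when an English-only variant exists, else size."""
    if english_only and size in ENGLISH_ONLY_SIZES:
        return size + ".en"
    return size


def transcribe(path: str, size: str = "small",
               language: Optional[str] = "en",
               prompt: Optional[str] = None) -> dict:
    """Transcribe a file with a fixed language instead of auto-detection."""
    import whisper  # imported lazily so pick_model_name works without it

    model = whisper.load_model(pick_model_name(size, english_only=(language == "en")))
    # Passing a language skips auto-detection, which only looks at the
    # beginning of the audio and can be misled by a foreign-language greeting.
    return model.transcribe(path, language=language, initial_prompt=prompt)


if __name__ == "__main__":
    print(pick_model_name("small", english_only=True))
    print(pick_model_name("large", english_only=True))
```

An initial prompt that mentions expected names (for example a brand such as Lidl) may also help with proper nouns, though results vary from recording to recording.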

Octavian Mot

  • Posts: 255
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Dec 28, 2022 12:05 pm

@CougerJoe @Mario69Rossi

A lot of the stuff you're describing has been discussed either in one of the open/closed issues on Github or detailed in the tool instructions (main Github page). For eg. source language, prompting, benefits of using different models, outputs, precision, hallucinations etc.

I feel that a lot of it can be prevented, but it really depends on your individual use case too.

I'd love to help more, but it would be much easier to give you relevant support if you either continue an issue that has already been raised on GitHub or simply open a new one if you think what you're experiencing is different. This way, we're also able to help others who work with the tool, since a lot of its users are not active on this forum but tend to read and contribute on GitHub. Also, having things centralized on GitHub helps us understand people's needs and uses, and what to focus on in the future.

Rick van den Berg

  • Posts: 1177
  • Joined: Tue Jun 02, 2015 7:47 am

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Dec 28, 2022 1:42 pm

Pff, i cannot get this to install properly, i'm already stuck at the "install virtualenv" step..
it says
"No suitable Python runtime found
Pass --list (-0) to see all detected environments on your machine
or set environment variable PYLAUNCHER_ALLOW_INSTALL to use winget
or open the Microsoft Store to the requested version."

I tried skipping it but that seems to really screw things up. Guess i'm one of the people who really need a single lazy install button
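
The quoted message appears to come from the Windows py launcher when the requested Python version is not installed. A minimal Python sketch for checking what is available follows. The 3.9 requirement is an assumption taken from an earlier post in this thread, and both function names are hypothetical. A patch release such as 3.9.13 still belongs to the 3.9 series.

```python
import shutil
import sys
from typing import Callable, Dict, Optional, Tuple


def launchers_on_path(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map common Python command names to their location on PATH, or None."""
    return {name: which(name) for name in ("py", "python", "python3")}


def is_series(version: Tuple[int, ...], required: Tuple[int, int] = (3, 9)) -> bool:
    """True if version belongs to the required major.minor series."""
    return tuple(version[:2]) == required


if __name__ == "__main__":
    for name, path in launchers_on_path().items():
        print(f"{name}: {path or 'not found on PATH'}")
    label = "3.9 series" if is_series(tuple(sys.version_info)) else "not 3.9"
    print("This interpreter:", sys.version.split()[0], f"({label})")
```

Running the launcher with --list, as the error text itself suggests, shows which versions it can see.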

Robert Niessner

  • Posts: 4379
  • Joined: Thu Feb 21, 2013 9:51 am
  • Location: Graz, Austria

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Dec 28, 2022 2:52 pm

Rick, I have written some steps for the installation which got me up and running:

viewtopic.php?p=906109#p906109

Have a look if it helps.
Saying "Thx for help!" is not a crime.
--------------------------------
Robert Niessner
LAUFBILDkommission
Graz / Austria
--------------------------------
Blackmagic Camera Blog (German):
http://laufbildkommission.wordpress.com

Read the blog in English via Google Translate:
http://tinyurl.com/pjf6a3m

Octavian Mot

  • Posts: 255
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Dec 28, 2022 2:55 pm

Rick van den Berg wrote:Pff, i cannot get this to install properly, i'm already stuck at the "install virtualenv"


Have you tried the standalone version? You don't need to install Python for that:
https://github.com/octimot/StoryToolkitAI/releases

Rick van den Berg

  • Posts: 1177
  • Joined: Tue Jun 02, 2015 7:47 am

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Thu Dec 29, 2022 9:13 pm

Hi Robert, i followed all of your steps, with no weird errors or whatsoever in the process, but i eventually get this cmd window, after which it just disappears:
[Attachment: Screenshot (156).png]


@Octavian
This looks like exactly the same file as Robert linked to in his description

Are you sure that's a standalone version? Anyway, when i double click the .exe file, i get the same cmd window as above, and then it just closes after a few seconds.

Jacob Olivera

  • Posts: 34
  • Joined: Mon Mar 30, 2015 5:51 pm

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Mon Jan 09, 2023 11:04 am

This looks amazing,
Thanks for making this !

One thing though, could this just work, like as an extension you install ? What is this installing Python, Homebrew stuff mess ?! I'm never gonna do that ! It needs to be a drag and drop thing you put in a folder and you see a new icon in Resolve and let's go ! If Steve Jobs was there he would agree...! :)

Uli Plank

  • Posts: 16602
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Mon Jan 09, 2023 11:28 am

Some of the best things from the open source scene don't work like that.
As polished and shiny as Apple stuff, I mean.
Don’t approach DR with your preconceptions from another NLE.
Many features are better, some worse, most are different.


Resolve Studio 18.1.2, MacOS 12.6.2
MacBook M1 Pro, 16 GPU cores, 32 GB RAM
and
iMac 2017, Radeon 580, 32 GB RAM
Speed Editor

CougerJoe

  • Posts: 149
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Jan 25, 2023 4:58 am

Rick van den Berg wrote:Hi Robert, i followed all of your steps, with no weird errors or whatsoever in the process, but i eventually get this cmd window, after which it just disappears:
Screenshot (156).png


@Octavian
This looks like exactly the same file as Robert linked to in his description

Are you sure that's a standalone version? Anyway, when i double click the .exe file, i get the same cmd window as above, and then it just closes after a few seconds.


Don't follow Robert's guide. He's installing Python when it's not needed; in fact, I believe that could be your problem. I had the same symptoms as you and fixed it by removing miniconda, but you could try removing any Python install that isn't necessary.

There is no need to reinstall this Whisper app afterwards; everything just starts working. People who know Python may know immediately what's going on, and may even know a way of keeping Python installed by changing a setting.

If you don't have Python installed, then it's some other problem.

t_hayash

  • Posts: 1
  • Joined: Fri Nov 20, 2020 11:13 pm
  • Real Name: Ted Hayash

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Jan 25, 2023 10:20 am

This tool would help me so much if I could get it to work - I'm running Mac OS 12.6 on an M1 laptop and Resolve 18.1.2, and am able to get the StoryToolKit to run, but it doesn't seem to connect to Resolve. I've tried rebooting, uninstalling the old versions of Python that were leftover on my machine, and redownloading from the GitHub repository.

Any hints?

Thank you so much in advance!
[Attachment: Screen Shot 2023-01-25 at 2.19.56 AM.png]

Robert Niessner

  • Posts: 4379
  • Joined: Thu Feb 21, 2013 9:51 am
  • Location: Graz, Austria

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Jan 25, 2023 12:13 pm

CougerJoe wrote:Don't follow Robert's guide. He's installing Python when it's not needed; in fact, I believe that could be your problem. I had the same symptoms as you and fixed it by removing miniconda, but you could try removing any Python install that isn't necessary.

There is no need to reinstall this Whisper app afterwards; everything just starts working. People who know Python may know immediately what's going on, and may even know a way of keeping Python installed by changing a setting.

If you don't have Python installed, then it's some other problem.


Well, in my case I had to install Python, otherwise StoryToolkitAI would not run on my system.
I tested all those steps out before writing the guide.

But it's worth a try to uninstall Python to see if that helps.

AdamSvatek

  • Posts: 2
  • Joined: Wed Jan 25, 2023 7:08 pm
  • Real Name: Adam Svatek

Re: Free Transcriptions in Resolve using OpenAI Whisper

Posted: Wed Jan 25, 2023 7:21 pm

Hi,

Thank you for making this and I am hoping to get this to work. I would love to finally get away from AVID. Transcription is the last tool I need for Resolve.

I have installed everything and the installation worked fine. When I open the app, I do not see a GUI that would allow me to take the first steps to transcribing my timeline. Does a GUI pop open automatically, or am I missing something due to user error? Is there a script somewhere in Resolve I need to click or open now? What happens is that it opens IDLE with a window containing a bunch of code (I don't know code at all). When I click "Run Module" in the menu, I get this:

Python 3.11.1 (v3.11.1:a7a450f84a, Dec 6 2022, 15:24:06) [Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license()" for more information.

=========== RESTART: /Applications/StoryToolKit/StoryToolkitAI/app.py ==========
Traceback (most recent call last):
  File "/Applications/StoryToolKit/StoryToolkitAI/app.py", line 20, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

I have an NVIDIA GPU but it's not CUDA. I went back and ran:

pip uninstall torch
pip cache purge
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

just in case. I'm still not seeing what I assume would be the GUI that is shown in the help folder next to the app.

Any help would be appreciated!
Thanks
Adam
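
For anyone hitting the same ModuleNotFoundError: a common cause is that pip installed torch into a different Python than the one IDLE is running. The thread does not say what actually fixed it for Adam, so treat the following only as a minimal diagnostic sketch. It prints which interpreter is running and whether that interpreter can find torch; the helper names module_available and report are made up for this example.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the interpreter running this script can find `name`."""
    return importlib.util.find_spec(name) is not None


def report(modules=("torch",)) -> dict:
    """Map each top-level module name to whether this interpreter can find it."""
    return {name: module_available(name) for name in modules}


if __name__ == "__main__":
    # The interpreter that IDLE (or the terminal) is actually using.
    print("Interpreter:", sys.executable)
    for name, found in report().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    # Installing through this exact interpreter avoids picking up
    # whichever `pip` happens to be first on PATH.
    print(f'Suggested: "{sys.executable}" -m pip install torch')
```

Run it from the same place you run app.py (for example via "Run Module" in IDLE). If torch shows as MISSING there but pip says it is installed, the two are most likely different Python installs.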
Offline

CougerJoe

  • Posts: 149
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Jan 26, 2023 1:22 am

Robert Niessner wrote:
Well, in my case I had to install Python, otherwise StoryToolkitAI would not run on my system.
I tested all those steps before writing the guide.

But it's worth a try to uninstall Python to see if that helps.


You could be right, because I initially tried installing the Python version of this app but had problems, and that install is probably still there. I should be more precise: I was running the Windows CUDA version of StoryToolkit and it worked great. I then tried another Whisper app that required miniconda; it turned out to use only the CPU, so I wasn't interested. That's when I discovered this app no longer started correctly. I removed miniconda, restarted the OS, and StoryToolkitAI worked again.
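
If you suspect several Python installs are getting in each other's way (miniconda plus a system Python, for example), it can help to see what is on your PATH before uninstalling anything. This is only a rough sketch: the function name find_interpreters is hypothetical, it only looks at PATH, and it cannot tell you which Python StoryToolkitAI itself uses or whether a conflict is really the cause.

```python
import os
from pathlib import Path


def find_interpreters(path_value, names=("python", "python3", "python.exe", "python3.exe")):
    """Return every matching file on the given PATH string, in search order."""
    hits = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for name in names:
            candidate = Path(directory) / name
            if candidate.is_file():
                hits.append(str(candidate))
    return hits


if __name__ == "__main__":
    found = find_interpreters(os.environ.get("PATH", ""))
    for position, exe in enumerate(found):
        # The first hit is the one a bare `python` command would normally start.
        marker = "  <- found first" if position == 0 else ""
        print(exe + marker)
```

If a miniconda path shows up first, that matches the situation I described above.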
Offline

AdamSvatek

  • Posts: 2
  • Joined: Wed Jan 25, 2023 7:08 pm
  • Real Name: Adam Svatek

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Jan 26, 2023 5:26 am

Never mind. I figured it out. Works great now. After seeing it in action, I would request speakers and visible timecode. Where's the best place to add feature requests? I wish I knew how to code so I could help on this project. This works well enough that I am comfortable using it for jobs. Great work to all involved!
Online

Robert Niessner

  • Posts: 4379
  • Joined: Thu Feb 21, 2013 9:51 am
  • Location: Graz, Austria

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostWed Feb 01, 2023 11:17 am

There is now a modified version of OpenAI Whisper available: WhisperX

https://github.com/m-bain/whisperX

This repository refines the timestamps of OpenAI's Whisper model via forced alignment with phoneme-based ASR models (e.g. wav2vec2.0), multilingual use-case.

Whisper is an ASR model developed by OpenAI, trained on a large dataset of diverse audio. Whilst it does produce highly accurate transcriptions, the corresponding timestamps are at the utterance level, not per word, and can be inaccurate by several seconds.

Phoneme-Based ASR: a suite of models fine-tuned to recognise the smallest unit of speech distinguishing one word from another, e.g. the element p in "tap". A popular example model is wav2vec2.0.

Forced Alignment refers to the process by which orthographic transcriptions are aligned to audio recordings to automatically generate phone-level segmentation.


It should make it possible to get precise timecodes and speaker recognition.
Before you get too excited: its author says it needs further testing, and Octavian would also need to do some testing and integrate it into his StoryToolkitAI.
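
To make the difference between utterance-level and word-level timestamps a bit more concrete, here is a small sketch. The segment and word data are invented for illustration (this is not WhisperX output and does not call WhisperX), and seconds_to_timecode is a hypothetical helper that assumes non-drop-frame timecode at an integer frame rate.

```python
def seconds_to_timecode(seconds: float, fps: int = 25) -> str:
    """Convert seconds to non-drop-frame HH:MM:SS:FF at an integer frame rate."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours, remainder = divmod(total_seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


# Utterance level: one start time for the whole sentence (made-up values).
segment = {"text": "We opened the shop", "start": 12.0}

# Word level: each word carries its own start time (made-up values).
words = [("We", 12.0), ("opened", 12.36), ("the", 12.8), ("shop", 12.96)]

if __name__ == "__main__":
    print(seconds_to_timecode(segment["start"]), segment["text"])
    for text, start in words:
        print(seconds_to_timecode(start), text)
```

With only the utterance-level time, an editor can jump to the start of the sentence; with per-word times, it can jump to any word in it.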
Offline

Octavian Mot

  • Posts: 255
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostWed Feb 01, 2023 11:47 am

We just wrapped a long production and are looking forward to continuing development on the tool starting next week.

There are some things we want to implement fast, but also a bunch of fixes that are long overdue! Thanks for your patience!
Trying to keep it together at mots.us
Taming AI for filmmaking at StoryToolkit.ai
Offline

Videoneth

  • Posts: 972
  • Joined: Fri Nov 13, 2020 11:03 pm
  • Real Name: Maxwell Allington

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Feb 02, 2023 2:53 pm

Robert Niessner wrote:There is now a modified version of OpenAI Whisper available: WhisperX

https://github.com/m-bain/whisperX


That's really cool!
I think I'll install it next to my current Whisper to test.

If this gets implemented in StoryToolkitAI, it will make the perfect transcription tool!

Is it from the same people who wrote these scripts? https://github.com/jianfch/stable-ts
Windows 10
18.1.2
nVidia 3090 - 527.56
Offline

Andy Mees

  • Posts: 2226
  • Joined: Wed Aug 22, 2012 7:48 am

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostSun Feb 05, 2023 5:52 pm

Here's what's going on in Adobe land.
TBE GIF_4.gif (172.21 KiB) Viewed 777 times
Let's have a return to the glory days, when press releases for new versions included text like "...with over 300 new features and improvements that professional editors and colorists have asked for."
Offline

CougerJoe

  • Posts: 149
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostSun Feb 05, 2023 10:08 pm

Andy Mees wrote:Here's what's going on in Adobe land.
TBE GIF_4.gif


Possibly a lot more accurate: from what I read, and assuming this isn't a typo, each language dictionary is 650GB, while the large multilingual dictionary in Whisper is only 3GB. Also, it doesn't look like Adobe does translations yet. You'd have to run Google Translate or similar on the transcribed language you want to translate, which is going to make it sound like Google Translate, and that doesn't sound very human.
Offline

Paddywack0

  • Posts: 29
  • Joined: Wed Sep 01, 2021 6:04 pm
  • Real Name: Nick Elborough

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostMon Feb 06, 2023 11:29 am

I've been using this tool since November and it has transformed my work. So thanks, Octavian. Thanks for the quick addition of timecode stamps back then. It is a game changer in my workflow.
I think I mentioned this on GitHub, but another great addition would be the ability to save a particular config for StoryToolkitAI and have the option of automatically processing the files rather than waiting for user input once it has received the WAV file from Resolve. That way the transcription is one button press away.

Thanks again, Nick
Windows 10 Pro - Version Build 19043.1766
HP z820 24core Dual 3.5 Xeon E5
128Gb Memory
Quadro M4000 graphics
Offline

Videoneth

  • Posts: 972
  • Joined: Fri Nov 13, 2020 11:03 pm
  • Real Name: Maxwell Allington

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostWed Feb 08, 2023 2:34 pm

I just saw an update for 17.7, what's new?
Offline

Octavian Mot

  • Posts: 255
  • Joined: Mon Aug 25, 2014 2:42 pm
  • Location: Germany

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostWed Feb 08, 2023 3:49 pm

We pushed an update, only for the non-standalone version, that adds support for the new Whisper module and the large-v2 model and also includes a few bug fixes. The large-v2 model is better than the large-v1 model in some cases (especially for non-English languages), but the large-v1 model is still available as an option in the Transcription Settings window. Feel free to give it a run and let us know if something doesn't work as it used to!

More updates are coming in the weeks ahead and hopefully we'll be able to pack them into standalone releases for MacOS and Windows soon!
Offline

Sander de Regt

  • Posts: 2844
  • Joined: Thu Nov 13, 2014 10:09 pm

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Feb 09, 2023 11:10 am

On the website/GitHub page it says that the tool will also (maybe) run in standalone mode, but I can't seem to find how to use it that way. Am I overlooking some info somewhere? I'm pretty new to gitting things done. I downloaded the exe archive because I thought that would be the easiest way to get started, but apart from the CMD window opening and closing by itself, I don't see much going on.
Sander de Regt

ShadowMaker SdR
The Netherlands
Offline

Andrew Kolakowski

  • Posts: 8845
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Free Transcriptions in Resolve using OpenAI Whisper

PostThu Feb 09, 2023 11:41 am

Mario69Rossi wrote:Any idea on how to improve the quality of the output?

At least I would like the timing to be correct. I think the fact that the AI model analyses only the audio is a problem, as it doesn't properly split the captions based on the different clips. In my experience it generates very long captions that should be split over a few different clips instead.


This is the next step. Analysing the speech is one thing, and turning it into captions is another (maybe even more difficult). Telestream (Timed Text Speech) has done it, but I'm not sure how good the end results are. It's not free, but not crazy expensive either (pay per minute, $0.10).
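
As a rough idea of what splitting captions by clip could look like once per-word timestamps are available: the sketch below breaks a word list at each cut point. Everything here is assumed for illustration, including the function name split_words_at_cuts, the (text, start) word format, and the cut list; it is not how StoryToolkitAI or Telestream does it, and real captioning also has to deal with line length and reading speed.

```python
from bisect import bisect_right


def split_words_at_cuts(words, cuts):
    """Group (text, start_seconds) words into captions, breaking at each cut.

    `cuts` is a sorted list of times (in seconds) at which a new clip begins.
    A word belongs to the clip in which its start time falls; a word starting
    exactly on a cut goes to the new clip.
    """
    groups = {}
    for text, start in words:
        clip_index = bisect_right(cuts, start)  # number of cuts at or before this word
        groups.setdefault(clip_index, []).append(text)
    return [" ".join(groups[index]) for index in sorted(groups)]


if __name__ == "__main__":
    # Made-up word timings and cut points.
    words = [("We", 0.2), ("opened", 0.6), ("the", 1.1),
             ("shop", 1.4), ("in", 2.1), ("March", 2.4)]
    cuts = [1.0, 2.0]
    for caption in split_words_at_cuts(words, cuts):
        print(caption)
```

One long caption becomes one caption per clip, which only works if the word timings themselves are accurate.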