The words you are reading have not been produced by Generative AI. They’re entirely my own.

The role of Generative AI

The only parts of what you’re reading that Generative AI has played a role in are the punctuation and the paragraphs, as well as the headings.

Challenges for an academic

I have to write a lot for my job; I’m an academic, and I’ve been trying to find a way to make ChatGPT useful for my work. Unfortunately, it hasn’t really been useful at all. It’s useless as a way to find references, except for the most common things, which I could just Google anyway. It’s really bad within my field and just generates hallucinations about every topic I ask it about.

The limited utility in writing

The generative features are useful for creative applications, like playing Dungeons and Dragons, where accuracy isn’t important. But when I’m writing a formal email to my boss or a student, the last thing I want is ChatGPT’s pretty awful style, leading to all sorts of social awkwardness. So, I had more or less consigned ChatGPT to a dusty shelf of my digital life.

A glimmer of potential

However, it’s a new technology, and I figured there must be something useful about it. Certainly, people have found it useful for summarising articles, and it isn’t too bad at that. But for writing, that’s not very useful. Summarising what you’ve already written after you’ve written it, while marginally helpful, doesn’t actually help with the writing part.

The discovery of WhisperAI

However, I was messing around with the mobile application and noticed that it has a speech-to-text feature. It’s not well signposted, and this feature isn’t available on the web application at all, but it’s not actually using your phone’s built-in speech-to-text. Instead, it uses OpenAI’s own speech-to-text called WhisperAI.

Harnessing the power of WhisperAI

WhisperAI can be broadly thought of as ChatGPT for speech-to-text. It’s pretty good and can cope with people speaking quickly, as well as handling large pauses and awkwardness. I’ve used it to write this article, which isn’t exactly short, and it only took me a few minutes.

The technique and its limitations

Now, the way you use this technique is pretty straightforward. You say to ChatGPT, “Hey, I’d like you to split the following text into paragraphs and don’t change the content.” It’s really important you say that second part because otherwise, ChatGPT starts hallucinating about what you said, and it can become a bit of a problem. This is also an issue if you try putting in too much at once. I found I can get to about 10 minutes before ChatGPT either cuts off my content or starts hallucinating about what I actually said.
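The steps above can be sketched in code. This is only a rough illustration, not anything ChatGPT or its app provides: the helper names (`chunk_transcript`, `build_prompt`, `content_unchanged`) and the `max_words` default are made up for the example, and the 1,200-word limit is just the "about 10 minutes" figure converted at 120 words per minute. It assumes the dictated text already has sentence-ending punctuation, and that you paste each prompt into ChatGPT by hand.

```python
import re

# The instruction quoted in the article; the second half is the important part.
PROMPT = ("Hey, I'd like you to split the following text into paragraphs "
          "and don't change the content.")


def chunk_transcript(text, max_words=1200):
    """Split a transcript into chunks of at most max_words words.

    Chunks break only at sentence ends, so a single sentence longer than
    max_words is kept whole as its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def build_prompt(chunk):
    """Put the instruction in front of one chunk of dictated text."""
    return PROMPT + "\n\n" + chunk


def _words(text):
    # Lower-case, drop apostrophes, then keep only runs of word characters,
    # so changes to punctuation, case and line breaks are ignored.
    return re.findall(r"\w+", re.sub(r"['’]", "", text.lower()))


def content_unchanged(original, processed):
    """True if both texts contain the same words in the same order."""
    return _words(original) == _words(processed)
```

Running `content_unchanged` on what you dictated and what came back is a cheap way to catch cut-off or invented content before you start editing; it is a crude check, and will flag harmless changes such as "1,200" becoming "1200".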

The efficiency of the method

But that’s fine. Speaking for about 10 minutes straight about a topic is still around 1,200 words if you speak at 120 words per minute, as is relatively common. And this is much faster than writing by hand. As for typing, the average speed is about 40 words per minute. Around 100 words per minute is not a strict upper limit, but it is where you start getting diminishing returns with practice.

The reality of writing speed

However, I think we all know that when it comes to actually writing, it’s just not possible to write at 100 words per minute. It’s much more common for us to write at speeds more like 20 words per minute. For myself, it’s generally 14, or even less if it’s a piece of serious technical work.
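To put those rates side by side, here is a small back-of-the-envelope calculation. The rates are the ones quoted above; the function names and the dictionary labels are just for illustration.

```python
# Words-per-minute figures quoted in the text above.
RATES_WPM = {
    "dictating": 120,
    "typing (average speed)": 40,
    "writing (typical)": 20,
    "writing (my own pace)": 14,
}


def words_produced(minutes, wpm):
    """Number of words produced in the given time at a steady rate."""
    return minutes * wpm


def minutes_needed(words, wpm):
    """Minutes required to produce the given number of words at a steady rate."""
    return words / wpm


if __name__ == "__main__":
    target = words_produced(10, RATES_WPM["dictating"])  # a 10-minute dictation
    for label, wpm in RATES_WPM.items():
        print(f"{label}: {minutes_needed(target, wpm):.0f} min for {target} words")
```

For the 1,200 words of a 10-minute dictation, this works out to 10 minutes spoken, 30 minutes at average typing speed, 60 minutes at 20 words per minute, and roughly 86 minutes at 14 words per minute.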

Unrivaled first draft generation

Admittedly, using ChatGPT as fancy dictation isn’t really going to solve the problem of composing very exact sentences. However, as a way to generate a first draft, I think it’s completely unrivaled. You can talk through what you want to write, outline the details, say some phrases that can act as placeholders for figures or equations, and there you go.

Revolutionizing the writing process

You have your first draft ready, and it makes it viable to actually do a draft of a really long report in under an hour, and then spend the rest of your time tightening up each of the sections with the bulk of the words already written for you and the structure already there. Admittedly, your mileage may vary.

A personal advantage

I do a lot of teaching and a lot of talking in my job, and I find that a lot easier. I’m also neurodivergent, so having a really short format helps, and being able to speak really helps me with my writing.

Seeking feedback

I’m really curious to see what people think of this article. I’ve endeavored not to edit it at all, so this is just the first draft of how it came out of my mouth. I really want to know how readable you think this is. Obviously, there might be some inaccuracies; please feel free to point them out where there are strange words. I’d love to hear if anyone is interested in trying this out for their work. I’ve only been messing around with this for a week, but honestly, it’s been a game changer. I suddenly look to my colleagues like I’m some kind of super prolific writer, which isn’t quite the case. Thanks for reading, and I’ll look forward to hearing your thoughts.

(Edit after dictation/processing: the above is 898 words and took about 8 min 30 s to dictate, i.e. ~105 WPM.)

  • @GSV_Spinnaker
    5
    1 year ago

    I suppose my big annoying soapbox opinion is one I’ve had with every other accidentally useful application of ChatGPT - why the hell are we using LLMs to do this? Splitting text into paragraphs can surely be done with a much simpler NLP model (i.e. one that doesn’t require a GPU per user), and it’s not like speech-to-text is new.

    I imagine you could use a simple model with ‘good enough’ accuracy, and then have some basic keybinds to very quickly fix up the rest of it. Goto-next-paragraph, goto-previous-paragraph, move-sentence-forward, and move-sentence-backwards would do the job.

    N.B. The general process is cool though! I’m neurodivergent like you and I’ve spent a lot of time lately thinking about how to make the actual process of note taking as seamless as possible. I’ve found that reducing the barriers between me and the metaphorical paper has really increased how likely I am to write my thoughts down.

    • @selfMA
      6
      1 year ago

      I’ve noticed this pattern all over in the AI hype cycle too — well-known and efficient techniques are either ignored in favor of something extremely wasteful or are rebranded to appear new. I’m actually starting to wonder how many AI startups use a data pipeline of existing techniques that incorporates an LLM step that’s effectively a no-op, or very close to one

      • @froztbyte
        4
        1 year ago

        probably a lot

        one very big part of this is the executive-tier push for “AI”. it’s not because AI, it’s purely social. “everyone else is doing it” and a lot of execs will literally fall over themselves to get something delivered in that manner. because they don’t actually understand the thing in comparison to other shit.

        I cite the way “the metaverse” bubbled and fizzed as my supporting datapoint here, along with n-many other bullshit hype cycles in the past

        • David GerardA
          5
          1 year ago

          and “blockchain”. Every now and then there’s a writeup in the finance press of some company that is totally using blockchain to move real money around, and in 100% of cases i’ve looked into the blockchain bit gets relegated to logging or removed entirely while MSSQL, Oracle or Postgres do the work. Of course, it’s still advertised as blockchain.

          • SteveM
            4
            1 year ago

            The weird thing about this is how the sales pitch for enterprise blockchain has not changed at all. Pick a presentation on YouTube from the last 6 months and I guarantee it will be identical to one from 5 years ago

            • SteveM
              6
              1 year ago

              I dug into my draft scraps to find this bit I wrote about it. It was never published so when it continues to be true you have to trust my word that I wrote it on the 19th of February 2023.

              A good example is to watch any presentation from the late 2010s about how the blockchain is going to change the food supply chain or some shit and it's going to help all these nondescript poor farmers who are being taken advantage of by these big corporations. They speak with a lot of passion and there is an undeniably strong sense that they believe the things they are saying. The problem becomes clear when you watch the second presentation. Pick one about blockchain in a different industry, say financial services, and you're likely to see the same template: properties (the trust machine, immutability, decentralisation), followed by an emotionally charged story about a helpless individual in a spot of avoidable strife, followed by an inconclusive conclusion which assumes you can clearly see how blockchain fixes it. A common characteristic of that emotionally charged story, I find, is that it describes a scenario that seems to point more to an incompetently managed process which somehow managed to forget that computers and the internet exist and attempts to run a cross-continent process with pencils, paper, and airmail. It's like a magician's act where the sleight of hand distracts you with blocks, chains, trust, and emotions so you forget to ask how the means of storing information makes any difference in the real world.

          • @BrickedKeyboard
            -2
            1 year ago

            I’m old enough to remember this same line of argument about internet company hype. That everyone wanted a company as successful as Microsoft or yahoo and was throwing money into anything that had a .com. Of course, one of those was Amazon.com

            It’s possible for one field to be 100% a scam (blockchain, NFTs), while another field is 99% a scam (AI startups), and yet the 1% ends up creating a massive new sector of the economy that is richer than anything prior.

            • @selfMA
              4
              1 year ago

              It’s possible for one field to be 100% a scam (blockchain, NFTs), while another field is 99% a scam (AI startups), and yet the 1% ends up creating a massive new sector of the economy that is richer than anything prior.

              again, it’s really weird that you seem to also be anti-cryptocurrency while not knowing that this is the exact reasoning used by some of the most self-harming problem gamblers in the cryptocurrency space — that all it’ll take is one unicorn to make everyone who bought in rich

              • @froztbyte
                3
                1 year ago

                walking example of the lottery gambler’s paradox - statistically you know it is vanishingly unlikely that you’ll be the winner, but maybe it could be you!! because our irrational squishy brainmeats are so goddamned easy to logic-hack, so the cycle continues

            • David GerardA
              4
              1 year ago

              I’m old enough to remember this same line of argument about internet company hype

              you’re extremely close to doing your dash on this fine server, cut this shit right now

            • @selfMA
              2
              1 year ago

              also, honest question: have you previously lost a whole lot of money gambling on cryptocurrency?

    • @TerribleMachinesOP
      4
      1 year ago

      Agreed, ChatGPT is nearly useless here compared to Whisper AI. Speech to text isn’t new, but my experience is that Whisper AI is much better than any other speech to text I’ve used.

      One benefit of this approach is that ChatGPT can also produce summaries which can help with early draft iteration or organising unstructured thoughts.

      • David GerardA
        6
        1 year ago

        better speech to text is a good thing

        I have done interviews using YouTube’s auto-transcribe on reasonably clear material and it always needs a great deal of cleanup

        • @TerribleMachinesOP
          3
          1 year ago

          Yeah, if anything cleaning up speech to text (and probably character recognition too) is the natural use of (these kinds of) LLMs as they pretty much just guess what words should be there based on the others. They still struggle with recognising words when the surrounding words don’t give enough context clues, but we can’t have everything!

          (Well until the machine gods get here /s 🙄)

          They’re also (anecdotally) pretty good at returning the wording of “common” famous quotes if you can describe the content of the quote in other words and I can’t think of other tools that do that quite so well. I just wish people would stop using them to write content for them: recently I was recruiting for a new staff member for my team and someone used ChatGPT to write their application. In what world they thought statisticians wouldn’t see right through that I don’t know 😆