Yes, here is yet another article on AI

2023-04-05

Oskar Trautmann, Wilhelm Rinke

Sure, you don’t need to read yet another article on AI. But this one might just surprise you…

You may have had an experience like this: you open LinkedIn and the first 13 posts are all about artificial intelligence. So, why should you read another post about AI?

You probably don’t have to, but here’s what you’ll get if you do:

We will look at the terms generative and general AI, examine the current hype, and speculate about possible futures. We will use the example of generative (text-to-)video AI because we believe it is still pre-factual.

In case you would rather not read the article, here is an AI-powered video that does it for you.

The video was generated with D-ID; the pictures were generated with DALL·E 2 and Midjourney V4.

The pictures show us, Oskar and Wilhelm, the authors – or, at least, they show how we could look, using the descriptive prompts:

“Portrait photo of a hip male natural red hair white t-shirt in a black room” for Wilhelm, and “portrait photo hip male natural brown curly hair brown beard glasses” for Oskar.

As you can see, we are not quite there yet. It’s still a bit shitty for everyday work. But with new and better tools appearing literally every day, we’re sure to get there soon.

Before we get lost in the hype, though, let’s take a step back and look at the status quo.

Where are we at?

In many ways, current AI is already proving to be more powerful and productive than the humans it could replace. Large language models (LLMs) and other AI tools can create workshop concepts on the fly, program apps and websites in seconds, and even pass the notoriously challenging bar exam with a score in the top 10 per cent.

But the little Gallic village of text-to-video AI is still holding out against the might of current AI developments. We have collected a few examples of what text-to-video and video AI can already do. They show that there has been a lot of progress. But they also demonstrate that we still have a long way to go before we can create our own videos or movies from scratch.

AInchorman could be the new normal.

In China, the state news agency has just unveiled an AI-powered anchor at the country’s World Internet Conference. Could this be the future of how we get our daily news?

AI transforming the arts.

Video as an art form, and music videos in particular, are currently very expensive and time-consuming endeavours. They are produced by a variety of creatives working together to visualise a song.

AI artist Ümüt Yildiz and his team have already created two fully AI-generated music videos for German rap/pop artist Cro, giving us a glimpse into the future of a symbiosis between (video) AI and music.

Are we there yet? (Runway)

An important development in this regard has been announced by RunwayML. They host a platform for artists to use machine learning tools without any coding experience, for media ranging from video to audio to text. If their text-to-video showcases can be trusted, endless possibilities could be just around the corner.

What about ChatGPT?

To address the elephant in the room: yes, ChatGPT will probably include features like text-to-video in the future. At the moment, only text-to-image and vice versa are planned for a future rollout, but given the seemingly endless possibilities of the tool, the feature is likely to become part of OpenAI’s offering.

We are not there yet

No matter which application or field of use we look at, we still need to distinguish between generative AI and general AI.

Trained to act human

Generative AI is the technology behind much of the current hype. It is AI trained on large datasets that creates new content in response to inputs, in formats such as text and code, images and graphics, video, audio and speech, and any other format you can think of.

Becoming human?

Artificial General Intelligence (AGI), on the other hand, is the representation of generalised human cognitive abilities in software. When facing an unfamiliar task, the AGI system can find a solution without being programmed for it. The intention of an AGI system is to perform any task that a human is capable of.

That can sound rather scary, but our colleague Shubhashis Sengupta puts the current state of AI and its scope into perspective:

“I can teach AI models to paint like Van Gogh, but it can’t be Van Gogh – it can’t be an artist because it can’t use imagination. One thing is deductive reasoning, and another thing is abstract reasoning. Combining them would result in symbiotic programming, combining the data part with the sixth sense to make sense of it, but we are not there yet!”

Even though new applications that seem to change our whole way of working are popping up at a rapid rate, artificial general intelligence is still a long way off. And until then, creativity and our human input will be our strongest tools to create value in an ever-changing world.

To put the current AI craze into perspective, we need only look back five years. Some may remember that we have already been through a complete AI hype cycle. So, an important thing to remember is that we could all be caught up in an emotional feedback loop again. This behaviour is very human.

Hype like this is not a new behaviour pattern: it runs from the first documented case, the Dutch tulip mania, to the more recent dot-com crash and crypto winter. AI could once again fail to live up to our immense expectations. Current investment patterns could be a warning sign, as they were during the AI craze five years ago. Back then, many companies turned out to be exaggerating the actual AI capabilities of their products. Not everything is as fluffy and shiny as the images that AIs like Midjourney pump out every nanosecond.

Where are we heading?

Well, here we are. And for sure, AI will shape the future, maybe even sooner than we expected. But it’s worth remembering that the future is never driven by technology alone.

Artificial intelligence is connected to today’s issues and realities. Trends in society, the economy, and politics will always interact with emerging technologies. AI won’t be used just for the sake of using AI. As a tool, it will be used in certain contexts by people with specific intentions. Which means that in the end, it is up to us where the journey goes.

Or in other words:

In skilled hands, it’s just a beginning.
– Space 10

So, where are we heading? Regarding video AI, two opposing scenarios instantly pop up in our minds. They are pure speculation and could be described as: “Endless generic content vs. Unleashed creativity”.

Endless generic content

This scenario has a dystopian feel to it. Endless generic content creates a generic world of generic information. AI-generated stock footage could fuel misinformation and deepfakes, resulting in serious media trust issues and an increasing need for real-life human experiences.

Unleashed creativity

This scenario paints a brighter future because the realisation of any idea is at our fingertips. With text-to-video, for example, we can generate meaningful content in no time. We can leave the burden of technical skills behind and focus on our pure creative spirit.

As always, we will probably find ourselves somewhere in between. Let’s wait, watch and play around. Oh, and by the way, this post wasn’t written by ChatGPT ;-)