
Posts Tagged ‘Captions’

Researchers teach an AI to generate logical images based on text captions

30 Sep

The Allen Institute for AI (AI2) created by Paul Allen, best known as co-founder of Microsoft, has published new research on a type of artificial intelligence that is able to generate basic (though obviously nonsensical) images based on a concept presented to the machine as a caption. The technology hints at an evolution in machine learning that may pave the way for smarter, more capable AI.

The research institute’s newly published study, which was recently highlighted by MIT, builds upon the technology demonstrated by OpenAI with its GPT-3 system. With GPT-3, the machine learning algorithm was trained using vast amounts of text-based data, an approach that itself builds upon the masking technique introduced by Google’s BERT.

Put simply, BERT’s masking technique trains machine learning algorithms by presenting natural language sentences that have a word hidden, requiring the machine to fill in the missing word. Training an AI in this way teaches it to recognize language patterns and word usage, the result being a machine that can fairly effectively understand natural language and interpret its meaning.
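
To make the idea concrete, here is a minimal sketch of masked-word prediction using a pretrained BERT model from the open-source Hugging Face Transformers library; it illustrates the general technique rather than the exact training setup used by Google or AI2.

from transformers import pipeline

# Wrap a pretrained BERT model in a fill-mask pipeline.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Ask the model to fill in the hidden word from context alone.
for prediction in unmasker("A dog is playing fetch in the [MASK].")[:3]:
    print(prediction["token_str"], round(prediction["score"], 3))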

Building upon this, the training evolved to include an image paired with a caption that has a missing word — for example, an image of an animal with a caption describing the animal and its environment, with only the word for the animal missing, forcing the AI to figure out the right answer based on the sentence and the related image. This taught the machine to recognize patterns in how visual content relates to the words in the captions.

This is where the AI2 research comes in, with the study posing the question: ‘Do vision-and-language BERT models know how to paint?’

Experts at the research institute built upon the visual-text technique described above to teach an AI how to generate images based on its understanding of text captions. To make this possible, the researchers introduced a twist on the masking technique, this time masking certain parts of the images paired with captions to train a model called X-LXMERT, an extension of the LXMERT model family that uses multiple encoders to learn connections between language and visual data.
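
As an illustration of what the underlying LXMERT model consumes, the sketch below feeds a caption and a set of image-region features through the Hugging Face implementation of LXMERT. In practice the region features come from an object detector such as Faster R-CNN, so random tensors stand in for them here, and X-LXMERT’s generative extensions are distributed separately by AI2 rather than through this library.

import torch
from transformers import LxmertTokenizer, LxmertModel

tokenizer = LxmertTokenizer.from_pretrained("unc-nlp/lxmert-base-uncased")
model = LxmertModel.from_pretrained("unc-nlp/lxmert-base-uncased")

# The language stream: a tokenized caption.
inputs = tokenizer("A large clock tower in the middle of a town.", return_tensors="pt")

# The visual stream: placeholder region features and bounding boxes
# (a real pipeline would take these from an object detector).
num_regions = 36
visual_feats = torch.randn(1, num_regions, 2048)
visual_pos = torch.rand(1, num_regions, 4)

outputs = model(**inputs, visual_feats=visual_feats, visual_pos=visual_pos)
print(outputs.language_output.shape, outputs.vision_output.shape)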

The researchers explain in the study [PDF]:

Interestingly, our analysis leads us to the conclusion that LXMERT in its current form does not possess the ability to paint – it produces images that have little resemblance to natural images …

We introduce X-LXMERT that builds upon LXMERT and enables it to effectively perform discriminative as well as generative tasks … When coupled with our proposed image generator, X-LXMERT is able to generate rich imagery that is semantically consistent with the input captions. Importantly, X-LXMERT’s image generation capabilities rival state-of-the-art image generation models (designed only for generation), while its question-answering capabilities show little degradation compared to LXMERT.

By adding the visual masking technique, the machine had to learn to predict what parts of the images were masked based on the captions, slowly teaching the machine to understand the logical and conceptual framework of the visual world in addition to connecting visual data with language. For example, a clock tower located in a town is likely surrounded by smaller buildings, something a human can infer based on the text description.
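
A toy sketch of that training signal, under the strong simplification that the model reconstructs the hidden region features directly (the actual X-LXMERT objective works with discretized visual representations rather than raw features), might look like this:

import torch

# Stand-in region features for one image (e.g. 36 regions of 2048 dimensions each).
regions = torch.randn(36, 2048)

# Randomly hide roughly a quarter of the regions.
mask = torch.rand(36) < 0.25
masked_regions = regions.clone()
masked_regions[mask] = 0.0

# predicted = model(caption_tokens, masked_regions)  # hypothetical model call
predicted = torch.randn(36, 2048)  # placeholder for the model's predictions

# Score how well the hidden regions were reconstructed from the caption and the remaining context.
loss = torch.nn.functional.mse_loss(predicted[mask], regions[mask])
print(loss.item())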

An AI-generated image based on the caption, ‘A large painted clock tower in the middle of town.’

Using this visual masking technique, the AI2 researchers were able to impart the same general understanding to a machine given the caption, ‘A large clock tower in the middle of a town.’ Though the resulting image (above) isn’t realistic and wouldn’t be mistaken for an actual photo, it does demonstrate the machine’s general understanding of the meaning of the phrase and the type of elements that may be found in a real-world clock tower setting.

The images demonstrate the machine’s ability to understand both the visual world and written text and to make logical assumptions based on the limited data provided. This mirrors the way a human understands the world and written text describing it.

For example, a human given a caption could sketch a concept drawing that presents a logical interpretation of how the captioned scene might look in the real world: computer monitors likely sit on a desk, a skier is likely on snow and bicycles are likely parked on pavement.

This development in AI research represents a type of simple, child-like abstract thinking that hints at a future in which machines may be capable of far more sophisticated understandings of the world and, perhaps, of any other concepts they are trained to relate to one another. The next step in this evolution is likely an improved ability to generate images, resulting in more realistic content.

Using artificial intelligence to generate photo-realistic images is already possible, though generating highly specific photo-realistic images from a text description is, as shown above, still a work in progress. Machine learning technology has also been used to demonstrate other potential applications for AI, such as a study Google published last month that demonstrated using crowdsourced 2D images to generate high-quality 3D models of popular structures.

Articles: Digital Photography Review (dpreview.com)

 
Comments Off on Researchers teach an AI to generate logical images based on text captions

Posted in Uncategorized

 

Instagram expands anti-bullying system, starts issuing alerts over offensive captions

17 Dec

Instagram has announced that its platform will start warning users when it detects that they’re about to post a potentially offensive caption on a photo or video. This new feature marks the expansion of the anti-bullying system Instagram introduced earlier this year.

In July, Instagram rolled out an AI-powered system that warns users when they attempt to publish a ‘harmful’ comment. That same technology is now being used to monitor captions for potentially offensive content as well, Instagram announced on Monday.

The system works by identifying captions that are similar to ones previously reported by users. When the system is triggered, a prompt will appear within the Instagram app that reads, ‘This caption looks similar to others that have been reported.’ Users have the option of either sharing the caption regardless or editing it before publishing.
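
Instagram has not published how the similarity check is implemented, but the general idea can be sketched with off-the-shelf sentence embeddings: compare a new caption against previously reported ones and warn if anything is too close. The captions, model choice and threshold below are purely illustrative.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy stand-ins for captions users have previously reported.
reported_captions = ["you are such a loser", "nobody likes you"]
reported_embeddings = model.encode(reported_captions, convert_to_tensor=True)

new_caption = "honestly nobody likes you at all"
new_embedding = model.encode(new_caption, convert_to_tensor=True)

# Warn if the new caption is very similar to anything previously reported.
similarity = util.cos_sim(new_embedding, reported_embeddings).max().item()
if similarity > 0.8:  # arbitrary threshold for illustration
    print("This caption looks similar to others that have been reported.")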

The feature is rolling out to ‘select’ countries at this time, but will be available globally in the ‘coming months.’

Articles: Digital Photography Review (dpreview.com)

 
Comments Off on Instagram expands anti-bullying system, starts issuing alerts over offensive captions

Posted in Uncategorized

 

Trint’s AI-powered plug-in automatically creates captions for Premiere Pro CC

13 May

We’re a little late to this one, but it’s an interesting option for video editors and worth sharing all the same. Last month, transcription company Trint launched a plug-in that uses artificial intelligence to automatically create captions in Adobe Premiere Pro CC. Called Trint for Premiere, the plug-in allows Premiere Pro CC users to upload videos to Trint’s system directly from Adobe’s application. Trint’s speech-to-text technology then automatically transcribes the audio and generates captions.

According to Trint, its system creates a draft transcript that the user refines in the Trint Editor. These corrected transcriptions are made available in the plug-in’s panel, providing direct SRT access within Premiere. The software also supports Edit Decision Lists (EDLs), simplifying soundbite creation.
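
For readers unfamiliar with the caption files involved, here is a small sketch that writes a transcript out in the SubRip (SRT) format that the panel exposes to Premiere Pro; the segment text and timings are made up for illustration.

# Write a couple of made-up transcript segments as a SubRip (.srt) caption file.
segments = [
    (0.0, 2.5, "Welcome back to the channel."),
    (2.5, 6.0, "Today we're looking at automatic captions."),
]

def to_timestamp(seconds):
    # SRT timestamps use the form HH:MM:SS,mmm.
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3600000)
    minutes, ms = divmod(ms, 60000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

with open("captions.srt", "w", encoding="utf-8") as srt:
    for index, (start, end, text) in enumerate(segments, start=1):
        srt.write(f"{index}\n{to_timestamp(start)} --> {to_timestamp(end)}\n{text}\n\n")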

Trint’s new plug-in is available from Adobe Exchange with a free trial. After the trial period ends, users can either pay a per-hour fee for transcribed video or sign up for a monthly subscription. The free-trial signup and more information are available on Trint’s website.

Press Release

Trint’s AI transcription software to add integrated panel for Adobe Premiere Pro CC in Enterprise Offering

London (April 4, 2018) – Customers of Adobe® Premiere® Pro CC, part of the Adobe Creative Cloud®, will now have the opportunity to make their workflow even more seamless using Trint’s automatic speech-to-text technologies.

Trint for Premiere, Trint’s new panel for Adobe Premiere Pro CC, enables direct upload of footage to Trint for fast automated transcription. This gives users a quick and integrated flow for making audio searchable and for creating captions for their media. Using Trint’s automated speech-to-text technologies, editors quickly get a machine-generated draft transcript that they can easily polish to perfect in the Trint Editor. Transcriptions corrected in Trint will be available in the panel, giving editors direct access to SRTs from within Premiere Pro CC.

Highlighted selections of Trint transcripts will be available as EDLs, making it fast and simple to find key soundbites and the editing process much quicker for users. EDLs for videos can be exported from the panel and can be seamlessly used in the Timeline.

“Our customers have been asking us for an Adobe Premiere Pro CC integration for some time,” says Jeff Kofman, CEO of Trint. “We know that many of our users are video editors, and it’s exciting to be able to streamline their captioning and editing workflows.”

Building on its strong transcription toolkit, the company is also releasing additional enterprise features which will make it the leading transcription service for organizations.

Trint has also released the new Trint mobile iOS app that lets users record interviews and meetings on their iPhone and upload them to Trint for fast transcription and collaboration from anywhere.

Articles: Digital Photography Review (dpreview.com)

 
Comments Off on Trint’s AI-powered plug-in automatically creates captions for Premiere Pro CC

Posted in Uncategorized

 

How to Use the Right Captions on Your Photos to Better Connect With Viewers

03 Jun

Thirty years ago, we used slides, prints and albums to share photos with family and friends. Now, between Facebook, Instagram, Flickr, Google+ and 500px, you have more options than ever to share your photos. The problem is, how do you connect with this much larger audience?

Sharing a story alongside your photo will help you connect with your followers, and often turn a great photo into something spectacular.

When you share a photo, people may wonder where it was taken, why you were there, what made you photograph the scene, or what was going through your head the moment you snapped the shutter. These are all questions that can be spun into a narrative and shared along with your photo.

The right caption draws viewers into the image

Here’s an example. Which of the following captions draws you in and makes the photo more interesting for you?

Image1

Caption 1: Kayakers on the Hudson River

Caption 2: Springtime in upstate New York is full of variable weather. The changing temperatures coupled with different types of precipitation can make for beautiful and unpredictable landscapes. On this particular morning, the Hudson River was covered in a thick fog and knowing how fleeting that can be, I hurried down to the waterfront hoping to capture some shots. Out of nowhere, two colorful kayakers appeared, adding life to my scene as they cut their way down the river and disappeared into the abyss.

I may be biased, but for me it’s Caption 2. Seeing a beautiful photo with a story attached to it pulls me in. It puts me in the same space that the photographer was in when they took the photo, enriches my experience, and ultimately makes the photo, which was good in the first place, a great one.

If you went to a yard sale and saw a beautiful glass bowl for $20, you may think, “Well, that’s a bit steep for a simple bowl at a yard sale.” But I bet your mindset would change if the owner told you a story about the bowl — how she acquired it at a glass blowing factory in Halifax back in the 1950s, how it was one of just a handful made and how the bowl moved around the United States with her and her family for the past 60 years. Now $20 seems like a bargain!

Nothing changed; you just got some more information. A story enriched your understanding and, in turn, completely changed how you experienced something.

Here’s another example:

Image2

Caption 1: The Mohonk Mountain House after an ice storm

Caption 2: It was early December and an ice storm had just ripped through the Hudson Valley leaving debris, destruction, and a clear blue sky in its wake. My wife and I began our hike that day at a lower elevation, and realized as we got higher that the entire forest was encased in ice. It was a winter wonderland that was both beautiful and dangerous. Limbs of trees were scattered everywhere, boulders were slick with ice and in some spots, five foot long icicles hung like stalactites above our heads. As we made our way to the top of the mountain, I stepped into a small gazebo overlook and focused on the Mohonk Mountain House and surrounding landscape, letting the icicles in the foreground frame my shot.

There’s nothing wrong with the first caption, but the second caption really paints a picture in the viewer’s mind and places them there with you.

Here’s another shot I took this winter. In the past I would have shared it with Caption 1 below, but instead I shared it with Caption 2, and found that it really resonated with my audience.

Image3

Caption 1: Winter Sunset

Caption 2: It was a Friday night and I rushed out of work wanting to photograph something. I made a quick stop at home, put on boots, and grabbed my snowshoes just in case. With so much snow on the ground I racked my brain for a spot that I could easily get to with the potential for a decent sunset shot. Luckily, this incredible vista is just down the street from me. I got there when the sky was beginning to turn all sorts of colors, hurriedly set up my tripod, and captured this winter sunset. I stayed for a little while, watching blues give way to pinks, yellows and oranges until all the color in the sky was gone and my frozen hands signalled to me that it was time to go home.

Viewer experience is enhanced

Not every photo needs a page of text written alongside it, but it’s been my experience that adding a couple of sentences, rather than just a few words (or none at all), greatly enhances the experience of the viewer. It helps them connect with your photo and ultimately with you as a photographer.

Image4

Caption 1: The Space Needle in Seattle

Caption 2: After an afternoon touring Seattle and Pike Place Market, my wife and I headed over to the Olympic Sculpture Park, but found it was closing just as we arrived. Disappointed that I wasn’t able to capture any images of the park, I turned my camera around towards the city as we left and captured this unique view of the iconic Space Needle.

Summary

When I share a photo, I want people to respond to it. I want them to share in the moment and feel what I was feeling when I took the photo. Your story might seem mundane to you, but it gives your audience a closer look at who you are and how you think — as a person and a photographer.


The post How to Use the Right Captions on Your Photos to Better Connect With Viewers by Joe Turic appeared first on Digital Photography School.


Digital Photography School

 
Comments Off on How to Use the Right Captions on Your Photos to Better Connect With Viewers

Posted in Photography

 

Creating a Simple Photo Page with Captions

15 Nov

At times you may need to place a set of photos on one page for a photo album or portfolio. Learn how to customize the size and captions for the photos. For more tutorials in higher-quality video, browse our catalog. This is a FREE download.

 
Comments Off on Creating a Simple Photo Page with Captions

Posted in Retouching in Photoshop