With Love, A.I: Self-Driving Cars (Part 3 of 3)

“To self-drive or not to self-drive is not the question” __Madhav SBSS

Sensors play a key role in self-driving cars. Radars are widely used because they are cheap and work well across weather and lighting conditions, though their resolution is low. Lidars are high-density sensors that work in the dark, but they are not cheap and they degrade in rain, fog and snow. Cameras are cheap and high density, but don’t perform well in poor lighting or bad weather. Because our eyes work in a similar way to a camera, it might be the right device for mimicking human visual perception.

Here is a visualization of where these sensors are being used today in developing self-driving car technology –

Source: selfdrivingcars.mit.edu

Here is a visual spider web representation of the effectiveness of each type of sensor –

Source: selfdrivingcars.mit.edu

Is driver state detection important in self-driving cars? Isn’t the driver supposed to be free to do whatever they want, as if they were sitting on a sofa at home? Here is a chart from Lex Fridman’s talk showing what to track, with tracking difficulty increasing from left to right.

Driver State Detection, source: selfdrivingcars.mit.edu
Challenges at the fork of Full vs Human-centered autonomy
Source: selfdrivingcars.mit.edu

As you can see in the above picture, there are still many areas in red (hard to solve), which means there are plenty of opportunities to make a difference in the field of self-driving cars. Be part of the dream that brings safety and access to humans trying to get from point A to point B.

Here are some of the companies that are worth following if you are interested in self-driving cars, technologies and most importantly the people driving the technology forward.

Companies working on fully autonomous self-driving tech –

Companies working on Human-centered autonomous self-driving tech –

Here are a few resources to further explore self-driving technologies –


With Love, A.I: Self-Driving Cars (Part 1 of 3)

“Google is working on self-driving cars and they seem to work. People are so bad at driving cars that computers don’t have to be that good to be much better” __Marc Andreessen

What’s the big deal with self-driving cars anyway? Why do we need them? Can’t humans do any work anymore? Are we so lazy that we just want to be transported from place to place without lifting a finger? If the car drives itself, what are we going to be doing? Watching videos? Chatting? Taking selfies? Reading a book? Getting more work done? These are a few of the questions that popped into my mind as I visualized a world where people are not driving their cars but cars are driving people to their destinations.

I was impressed by this image Lex Fridman presented in his MIT self-driving lecture series on what the BFD with self-driving cars is.

source: MIT Self-Driving Cars Lecture

That’s not the point of this note, though. I’d like to explore how Artificial Intelligence is helping cars drive themselves, what some of the open challenges are, and how A.I. might help solve these problems in the future. How might autonomous cars reduce accidents and give people who can’t drive an option to “drive”?

How AV technology works

There are 3 key things that make self-driving possible –

  1. Sensors
    1. LIDAR (LIght Detection And Ranging)
    2. Radar (Radio waves to detect objects, angles, distance etc)
    3. Ultrasonic & Others (Odometer and other “close to vehicle” sensing)
    4. Cameras (To “see” signal lights)
  2. Software (To process all the sensor data)
  3. Connectivity
    1. Internet (to communicate with cloud or other vehicles)
    2. GPS (positioning system so the car knows where it is; self-driving needs accuracy to the centimeter, which today’s GPS cannot support)

Hear an interesting podcast on the shift to self-driving cars

source: https://www.ucsusa.org/clean-vehicles/how-self-driving-cars-work

Sensors collect millions of data points about objects to the sides, front and back of the car, including other moving vehicles nearby.

Software and algorithms process the data collected through the sensors and make decisions on acceleration, braking, turns, speed and so on.

Connectivity helps the car “know” road conditions, weather, construction and traffic. (Is traffic still going to be an issue? Maybe not, as long as there are no disruptions like construction, a drunk person walking across the road or a sleeping cow.)
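To make the sensor-to-decision step a little more concrete, here is a minimal sketch in Python. It is not how any real self-driving stack works. The `SensorReading` structure, the `decide` function and the 2-second and 4-second thresholds are all made up for illustration. It takes one snapshot of the gap to the object ahead and picks an action from the time-to-collision.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    """One snapshot of the road ahead (hypothetical structure)."""
    gap_m: float              # distance to the object ahead, in meters
    closing_speed_mps: float  # how fast the gap shrinks, in m/s (<= 0: not closing)


def decide(reading: SensorReading,
           brake_below_s: float = 2.0,
           slow_below_s: float = 4.0) -> str:
    """Pick an action from the time-to-collision (gap / closing speed).

    The thresholds are illustrative assumptions, not industry values.
    """
    if reading.closing_speed_mps <= 0:
        return "maintain speed"  # the gap is steady or growing
    time_to_collision_s = reading.gap_m / reading.closing_speed_mps
    if time_to_collision_s < brake_below_s:
        return "brake"
    if time_to_collision_s < slow_below_s:
        return "slow down"
    return "maintain speed"


if __name__ == "__main__":
    for reading in (SensorReading(10, 8), SensorReading(30, 10),
                    SensorReading(100, 5), SensorReading(20, -1)):
        print(reading, "->", decide(reading))
```

A real system would fuse many sensors, track many objects and plan a full trajectory, but the shape is the same: data in, decision out.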

Will self-driving cars look like the cars of today? Perhaps not. There is no need for a steering wheel, windows, wipers, mirrors, lights and foot pedals. On the flip side, not everything on a car is there only for functionality; some of it is also there for esthetic reasons.

So anyway, how does self-driving technology actually work? We will see in Part 2 of this writeup.

With Love, A.I: Transcription (1 of 2)

tran·scrip·tion /ˌtran(t)ˈskripSH(ə)n/
a written or printed representation of something.

A written representation of audio is normally called a transcription. How does one go from audio to the written word? We could listen to the audio word by word and note down the written representation of each word. This is a manual process, and sometimes we may need to pause the audio to catch up. Some words might be transcribed incorrectly.

Can AI help speed up the process and reduce errors in transcription? That’s a rhetorical question because AI already does this to some extent; we have seen it in products from Apple, Amazon, Google and others.

What would it take for a machine to listen and convert what it hears into written words? In the simplest sense, assuming that the machine knows the entire vocabulary of the language the audio is in (e.g. English), it can compare each spoken word with its vast library of phonemes to figure out which word the audio maps to and “type” that word into a text editor. Repeating this process for every uttered word will produce a text document that is (hopefully) an exact representation of the audio.

For example, the spoken word “Potato” could be recognized by software that breaks the audio down into its basic phonemes using the library of phonemes, matches the resulting phoneme sequence against a library of words, and takes context into consideration to figure out whether what was spoken, be it “Pohtahtoh”, “Pahtayto” or something else, should really be written as “Potato”.
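The lookup described above can be sketched in a few lines of Python. This is a toy, not how production recognizers work. The `LEXICON` entries, the ARPAbet-style phoneme symbols and the `best_word` function are assumptions made up for illustration, and plain edit distance stands in for the real acoustic and language models. It does show how both “Pohtahtoh” and “Pahtayto”, and even a slightly misheard version, can map to the same written word.

```python
# Toy pronunciation lexicon: word -> accepted phoneme sequences.
# Symbols are ARPAbet-style; the entries are illustrative only.
LEXICON = {
    "potato": [("P", "AH", "T", "EY", "T", "OW"),   # "Pahtayto"
               ("P", "OW", "T", "AA", "T", "OW")],  # "Pohtahtoh"
    "tomato": [("T", "AH", "M", "EY", "T", "OW"),
               ("T", "AH", "M", "AA", "T", "OW")],
    "photo": [("F", "OW", "T", "OW")],
}


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # drop a heard phoneme
                            curr[j - 1] + 1,      # add a missing phoneme
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def best_word(heard):
    """Return the lexicon word whose closest pronunciation needs the fewest edits."""
    return min(LEXICON,
               key=lambda word: min(edit_distance(heard, p) for p in LEXICON[word]))


if __name__ == "__main__":
    print(best_word(("P", "OW", "T", "AA", "T", "OW")))
    print(best_word(("P", "AH", "T", "EY", "D", "OW")))  # one phoneme misheard
```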

Apparently, many speech recognition systems have traditionally used something called Hidden Markov Models (HMMs).

Specific example of a Markov model for the word POTATO
The more general representation of Markov model. source: wikipedia
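To get a feel for how a Markov model of a word can be used, here is a small Python sketch of the forward algorithm on a left-to-right HMM with one state per phoneme. Everything specific in it is an assumption for illustration: the phoneme symbols, the 0.8 chance of hearing the right phoneme, the 0.5 chance of staying in a state, and the idea that each audio frame arrives already labeled with a phoneme guess. Real systems work on acoustic feature vectors, not labels.

```python
PHONES = ["P", "AH", "T", "EY", "M", "OW"]  # toy phoneme inventory

WORDS = {
    "potato": ["P", "AH", "T", "EY", "T", "OW"],
    "tomato": ["T", "AH", "M", "EY", "T", "OW"],
}


def emission(state_phone, observed, p_correct=0.8):
    """P(observed frame label | state); the rest is spread over the other phonemes."""
    if observed == state_phone:
        return p_correct
    return (1 - p_correct) / (len(PHONES) - 1)


def word_likelihood(word_phones, frames, p_stay=0.5):
    """Forward algorithm: P(frames | word model).

    Each state either repeats (p_stay) or moves on to the next phoneme
    (1 - p_stay). A path starts in the first state and ends in the last.
    """
    n = len(word_phones)
    alpha = [0.0] * n
    alpha[0] = emission(word_phones[0], frames[0])
    for obs in frames[1:]:
        new_alpha = [0.0] * n
        for s in range(n):
            stay = alpha[s] * p_stay
            advance = alpha[s - 1] * (1 - p_stay) if s > 0 else 0.0
            new_alpha[s] = (stay + advance) * emission(word_phones[s], obs)
        alpha = new_alpha
    return alpha[-1]


def recognize(frames):
    """Pick the word whose model gives the frames the highest likelihood."""
    return max(WORDS, key=lambda word: word_likelihood(WORDS[word], frames))


if __name__ == "__main__":
    frames = ["P", "P", "AH", "T", "T", "EY", "EY", "T", "OW"]
    for word, phones in WORDS.items():
        print(word, word_likelihood(phones, frames))
    print("recognized:", recognize(frames))
```

The repeated frames ("P", "P" or "EY", "EY") are why each state can loop back on itself: a phoneme usually lasts longer than one slice of audio.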

Can you implement a speech recognition and transcription system for the Telugu language using off-the-shelf libraries? This is a question I don’t know the answer to, but let’s find out.

I set out looking for speech recognition libraries that are already available and that I can leverage, and found a few. I don’t know which one is best suited for my purpose. I’ll go with the Google Cloud Speech-to-Text API, as it claims to support 120 languages and Telugu is one of them.
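For readers who want to try this from code, a request to the Speech-to-Text REST API might look roughly like the Python sketch below. It is only a sketch under assumptions: the helper names `build_recognize_request` and `transcribe` are mine, the audio is assumed to be 16 kHz LINEAR16 (uncompressed PCM), and authentication is assumed to be a plain API key. Check Google’s documentation before relying on any of it.

```python
import base64
import json
import urllib.request


def build_recognize_request(audio_bytes: bytes,
                            language_code: str = "te-IN",
                            sample_rate_hertz: int = 16000) -> dict:
    """Build the JSON body for the v1 speech:recognize method.

    "te-IN" is the language code for Telugu (India). The encoding and
    sample rate are assumptions about the audio clip.
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def transcribe(audio_bytes: bytes, api_key: str) -> list:
    """Send the clip to the API and return the top transcript of each result.

    Needs network access and a valid API key (hypothetical setup).
    """
    url = "https://speech.googleapis.com/v1/speech:recognize?key=" + api_key
    body = json.dumps(build_recognize_request(audio_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [result["alternatives"][0]["transcript"]
            for result in payload.get("results", [])
            if result.get("alternatives")]


if __name__ == "__main__":
    # Only builds and prints the request body; nothing is sent.
    print(json.dumps(build_recognize_request(b"\x00\x01\x02\x03"), indent=2))
```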

I uploaded a Telugu song clip and Google STT produced the following –

The lines go –

Nee Pada Sevaku Velaye Swami 
Aapada Baapava Aananda Nilaya
Daari Tennu Teliyaga Leni 
Daasula Brovaga Vegame Raava

Google transcribed that to –

Pada Sevaku Velaye Swami, Nee Pada Seva Vela
Teliyagaane Leni Naa Manasulo

What just happened? Why did Google’s transcription not work? In fact, it is so far off that the transcribed text reads like gibberish.

It’s possible the audio was not of great quality. It’s also possible that the Telugu vocabulary universe of the Google Speech-to-Text system (GSTT) is limited. Perhaps the words Aapada, Baapava, Aananda, Nilaya, Daari, Tennu and others are not transcribed properly because the related phonemes are missing from GSTT.
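One way to put a number on “so far off” is the word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. Here is a minimal Python sketch that applies it to the lyrics above. It compares the romanized words exactly as written in this post, which is my assumption; a fair evaluation would compare text in Telugu script.

```python
REFERENCE = ("Nee Pada Sevaku Velaye Swami "
             "Aapada Baapava Aananda Nilaya "
             "Daari Tennu Teliyaga Leni "
             "Daasula Brovaga Vegame Raava")

HYPOTHESIS = ("Pada Sevaku Velaye Swami, Nee Pada Seva Vela "
              "Teliyagaane Leni Naa Manasulo")


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return [word.strip(",.").lower() for word in text.split()]


def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / words in the reference."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word was dropped
                            curr[j - 1] + 1,      # extra word was inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print("WER:", round(word_error_rate(REFERENCE, HYPOTHESIS), 2))
```

A WER of 0 means a perfect transcript, and the score can go above 1 when the transcript contains many extra words.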

Can one add phonemes and new words to GSTT to improve its accuracy? Funnily enough, it is possible to add vocabulary to GSTT. It’s simple but not easy: it requires you to know programming and how to use Google’s STT Application Programming Interface (API). We will look at how to improve Google’s Speech-to-Text system by adding to its vocabulary in Part 2!
