Wednesday, June 27, 2018

Data Science at Toyota Connected

https://towardsdatascience.com/data-science-at-toyota-connected-69bf50982b09

Presented at Data Science Salon in Dallas by Brian Kursar, Vice President and Chief Data Scientist at Toyota Connected.
Data is everywhere: in our digital footprint, in our financial system, and in our cities, down to the very cars we drive. Vehicles have become increasingly smart, and are now rich sources of data from which to derive valuable insights about customer behavior. We were thrilled to be joined by Brian Kursar, Vice President and Chief Data Scientist at Toyota Connected, at the Data Science Salon in Dallas. He imagines a future where cars continue to add value for their drivers long after they leave the dealership. Here’s the transcript of his fascinating talk.
I’m very excited to talk to everyone about Toyota Connected! First off, let me just ask who here has heard of Toyota Connected. No, not Toyota; Toyota Connected. Wow, very, very cool! For those of you who don’t know, Toyota Connected is a brand new company. We are about three minutes away from the corporate office, and we are a for-profit company. We are the arm for data science and data engineering for Toyota. We truly are a start-up: we have about 200 engineers, we work in a different building, and our culture is completely different from that of Toyota Motors in North America, but we are powered by Toyota. We have a lot of backing from the parent company, and that really allows us to do a lot of things [that are] very innovative [in] a very different type of culture, where we’re empowering our team with the ability to make decisions and to follow through on those decisions. As I mentioned, it’s a completely different office, and if you walk into our office you will notice we have a dog-friendly policy and free lunches for everyone. As a matter of fact, one of my favorite things about Toyota Connected is that they actually label the vegan soups, which is something that makes me very happy.
Let me talk a little bit about where we see things and where Toyota Connected fits in. We see the car as an essential piece of the internet of everything. You really start out with the Toyota connected car. And what is that? Behind the scenes, every Toyota connected car coming out since July 2017 is able to transmit sensor data representing various things, such as whether or not the windows are down, GPS, speedometer, odometer, and steering angle. But we do all of this only with the consent of the customer. These sensors are actually dead when you go to the dealership. We will walk you through some of the various use cases for our Safety Connect program, and this is what I’ll be talking about today. However, we only enable it once the customer understands what data we’re collecting and how we’re using that data to create new and exciting services for them. The average customer drives about 48 minutes per day. About 500+ unique data points are generated every 200 milliseconds, and that comes out to about 7.2 million data points per connected vehicle per day. [That’s] A LOT OF DATA! Petabytes of data! What do we do with it? First and foremost, as I mentioned earlier, [we] write data services that drive customer satisfaction; we’re looking to create new and exciting services that make driving safer, more convenient, and fun. Next, we use that data to derive new insights to make our products better.
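The per-vehicle volume quoted above can be checked with a few lines of arithmetic. This is only a sketch of the calculation using the round figures from the talk (48 minutes of driving, 500 data points per sample, one sample every 200 milliseconds); the function and constant names are illustrative and not part of any Toyota Connected system.

```python
# Figures quoted in the talk; names are illustrative only.
MINUTES_DRIVEN_PER_DAY = 48
POINTS_PER_SAMPLE = 500       # "500+" in the talk, so this is a lower bound
SAMPLE_INTERVAL_MS = 200


def data_points_per_day(minutes=MINUTES_DRIVEN_PER_DAY,
                        points_per_sample=POINTS_PER_SAMPLE,
                        interval_ms=SAMPLE_INTERVAL_MS):
    """Return the number of data points one vehicle generates per day."""
    driving_ms = minutes * 60 * 1000        # total driving time in milliseconds
    samples = driving_ms // interval_ms     # number of 200 ms sampling windows
    return samples * points_per_sample


total = data_points_per_day()
print(f"{total:,} data points per vehicle per day")
```

With these inputs there are 14,400 samples per day (five per second over 2,880 seconds), which matches the 7.2 million figure given in the talk.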
In the very short time that Toyota Connected has been around, two years now, we’ve had a number of milestones. In April 2017, we worked with the folks in Tour to connect [with] Japan on a project called Japan Taxi, which I’ll be talking about in a moment. The connected car went live in July of 2017, and that was for the model year 2018 Camry. We then looked at actually using that data in what we call a car share pilot on the Island of Hawaii, with a company that does distribution for our vehicles there. And then finally we went live with the car share pilot, now called Hui, as well as going forward with Avis to be able to connect the vehicles in their rental program transactions. Japan Taxi was our first really deep dive into the connected car, because this was done before vehicles in the US or Japan were connected. For this pilot we teamed up with TRI. For those who don’t know, TRI is the research arm of Toyota that focuses on the autonomous vehicle. This was an opportunity to really work with them to collect data and provide them data from actual people driving taxis in Japan. With this project, we used special aftermarket devices for eight hours a day, every day, and it’s actually still going now; we are collecting the data from these trips. If you look here, this is a really quick video of the application that we created. In this application, each dot represents a vehicle, and all this is collected in real time. You can drill in to one of the vehicles and then see the vehicle driving. This one here is driving late at night (our morning, night in Japan) on the streets.
What do we use this for? This is actually what we’re doing: leveraging machine learning and object recognition, and then providing that and those videos to TRI, the research institute, so they can take what they’re finding and create their own algorithms to improve the autonomous vehicle. Another thing we do, outside of research, is look at new ways to provide new services for our customers. One of the things we’re developing is a driver score that’s gonna be live probably in the next four to six months. Actually Demuth’s working on it; he’s sitting in the back there and he can talk to you more about that if you’re interested. Here’s what a driving score is: we have a set of metrics or rules that NHTSA provides to enable us to take what we call CANbus data, or data that’s coming out of the vehicle, and derive insights and scores on different types of events. Primarily we’re looking at longitudinal and lateral g-force, the location, the speed, and then how much brake pressure you’re applying. To give you an example, [let’s] really drill down into four trips. The green will represent what we call smooth driving, meaning you’re not going past the speed limit, you’re not doing any harsh braking, and you’re not having any harsh right turns or left turns or over speeding. The red there is what we would call harsh braking. Then you’ve got that maroon, which I don’t think you can see well in here, which represents over speeding. Above the line there you’ve got the harsh acceleration, and here I don’t think we have any hard left turns. What does this look like? Let’s drill down even further, so here’s one of those trips. The green shows that the person currently driving here is driving 37.96 miles per hour; the speed limit at that location is 40 miles per hour, therefore it’s green.
There’s no harsh braking, there’s no longitudinal or lateral g-force popping out, and as you can see, for the most part this person is driving smoothly. Just about 15 seconds later you can see that this person is now speeding. The speed limit there is 30 miles per hour and this person is going 49.56 miles per hour. Keep in mind every one of these dots is a single second. Four seconds later this customer hits the brakes really hard, and now you’re able to see that red line, or the red circle, which shows harsh braking.
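The per-second color coding described in this walkthrough can be sketched as a small rule-based classifier. This is a minimal illustration, not the scoring rules Toyota Connected uses: the g-force thresholds, the `Sample` fields, and the precedence of the checks are all assumptions, since the talk does not give the actual values.

```python
from dataclasses import dataclass

# Hypothetical thresholds in g; the real rule values are not given in the talk.
HARSH_BRAKE_G = 0.4
HARSH_ACCEL_G = 0.4
HARSH_TURN_G = 0.4


@dataclass
class Sample:
    """One second of trip data (field names are illustrative)."""
    speed_mph: float
    speed_limit_mph: float
    longitudinal_g: float = 0.0   # negative means decelerating
    lateral_g: float = 0.0        # sign indicates left or right


def classify(sample: Sample) -> str:
    """Label one sample; force-based events take precedence over speeding."""
    if sample.longitudinal_g <= -HARSH_BRAKE_G:
        return "harsh_braking"
    if sample.longitudinal_g >= HARSH_ACCEL_G:
        return "harsh_acceleration"
    if abs(sample.lateral_g) >= HARSH_TURN_G:
        return "harsh_turn"
    if sample.speed_mph > sample.speed_limit_mph:
        return "over_speeding"
    return "smooth"


# Speeds from the talk; the braking sample's values are made up for illustration.
trip = [
    Sample(37.96, 40),                       # under the limit
    Sample(49.56, 30),                       # over the limit
    Sample(35.0, 30, longitudinal_g=-0.6),   # hard stop
]
labels = [classify(s) for s in trip]
print(labels)
```

A trip-level score could then be derived from the share of samples in each category, although how the real score weights these events is not stated in the talk.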
That’s nice information, so what are we doing with that? We’re able to provide the customer with driving tips. These are tips that will help them understand their overall trip score, what their acceleration is, and what their speed is. We are now able to take these tips and show them how they can get better miles per gallon. We’re also able to allow them to say, this is my data and I’m going to actually do something with it. They can take this data and send it out to another service called Toyota Information Insurance Management Services. If they are a good driver, they can send this data to different insurance companies and have them bid for discounts for that customer. US models starting with the 2018 Camry, which came out in July, have these sensors. Now these sensors are actually dead at the dealership. You have to go through a walkthrough where the dealer talks to you about what data is collected and how it’s being collected, and then you can opt in. We don’t collect the data unless you opt in, but it’s for the most recent Camry, the upcoming RAV4, the upcoming Avalon, and I think there’s a handful of Lexus vehicles as well. Fleet is one of our number one customers, because the fleet customers want to know things that are a lot more detailed in terms of vehicle location. They want to understand such things as: are the windows open or closed, has the car been in a collision, what is the fuel level.
A lot of the things that I mentioned earlier with the Avis project have actually gone live for fleet, to be able to understand the health of the vehicle and to cut down on the time that they’re spending on the checkout process. It’s not in blockchain, but we do have a data store that the data is being saved in, yes. Question: Hey, why do you call it a g-force, because what you’re really measuring is stress, strain, and shear forces? G-forces are typically used as a nomenclature when you have a force large enough to make it feel like a percentage of gravity, at least half a g. Right, so the transfer of the g-force is from kinetic energy to potential energy. That’s how we are able to understand our harsh cornering and harsh braking as well. It’s not just the brake pressure at all, true, but I think that’s the way that we look at it. There is a guideline, and they actually consider it as g-force. You can get it after the trip is completed; we do this at a trip level. We do have the potential to provide it in real time, but we only provide it at the end of the trip, at least for the next version that will be coming out.
The next type of service that we provide is what we call collision detection. Very similarly, we’re looking at the longitudinal and lateral g-force, the acceleration, and the brake pressure. Here I’d like to say that the data really tells a story, and this is just an example, where we have the longitudinal, lateral, and vertical g-force; so that’s the X, Y, and Z axis here. And here you can see the acceleration, a red line here, and so as the customer is driving and they accelerate, you can see that in that line coming down here, and it goes to about there as they stop. Now the moment you see that transfer of energy, you can see that it’s right about here, and that’s actually what we’re able to use to understand collision notifications. This person was lucky because the airbag was triggered. With the way the sensors are set up on the vehicle, as well as how we are measuring that, we were able to understand front collisions, and we’ve been doing that for many, many years. But the problem is that the airbag does not go off when you have a rear-end collision or side collision, or if you were to flip over and find yourself in the bottom of a ditch. And so because of that we’ve realized that we have to really start looking at the data a little bit differently and understand that the airbag sensors, yes, they do trigger a notification and they will call the paramedics, but they are not enough to ensure the safety of our customers.
What we’ve been doing now is looking at classifying crashes into three different buckets, and then also looking at how we eliminate some of the false positives. For instance, this is the area right here where you’re going between five and eight point five in the magnitude of the g-force, and that’s where we traditionally will have airbags deployed. If you then go and look at areas that are not being looked at, these are the things that we mentioned: the high to medium/low speed crashes where it is a side impact, where it is a rear impact, where the car’s flipped over. Now those are the areas that we need to be able to understand, while also eliminating the false positives. Harsh braking and harsh cornering are absolutely false positives. What about other things like hitting a shopping cart? Well, that’s not so bad. What about going over a speed bump? Well, that’s also not a collision. We teamed up with a number of companies to pull in data, so that we can compare things like video data and data where they’ve actually done crash tests, and then pass those through our models to derive what we call the area of opportunity, or the areas where we can now provide newer and better services than were available before.
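As a rough illustration of the bucketing idea described here, the sketch below combines the three g-force axes into a single magnitude and sorts an event into one of three bands. Only the lower edge of the airbag band (5 g) comes from the talk; the bucket names, the 1.5 g review threshold, and the assumption that the readings are already gravity-compensated are all hypothetical, and the models mentioned in the talk are certainly more involved than a pair of thresholds.

```python
import math

AIRBAG_RANGE_MIN_G = 5.0   # lower edge of the 5 to 8.5 g band quoted in the talk
REVIEW_MIN_G = 1.5         # hypothetical floor for events worth a closer look


def g_magnitude(longitudinal_g: float, lateral_g: float, vertical_g: float) -> float:
    """Combine the X, Y, and Z g-force readings into one magnitude."""
    return math.sqrt(longitudinal_g ** 2 + lateral_g ** 2 + vertical_g ** 2)


def classify_impact(longitudinal_g: float, lateral_g: float, vertical_g: float) -> str:
    """Place an event in one of three illustrative buckets by magnitude."""
    magnitude = g_magnitude(longitudinal_g, lateral_g, vertical_g)
    if magnitude >= AIRBAG_RANGE_MIN_G:
        return "airbag_range"    # the band where airbags traditionally deploy
    if magnitude >= REVIEW_MIN_G:
        return "needs_review"    # possible side, rear, or rollover impact
    return "no_event"


print(classify_impact(3.0, 4.0, 0.0))   # magnitude is exactly 5.0
print(classify_impact(0.0, 2.0, 0.0))   # mostly lateral, below the airbag band
```

Events in the middle bucket are where false positives such as speed bumps, shopping carts, harsh braking, and harsh cornering would have to be separated from real side or rear impacts, which is the role the talk assigns to the models trained on video and crash-test data.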
Different drivers who drive in the same car have different styles — like my wife and I, we share one of those Camrys. In the future though I think we will have head units [that] are very much like Netflix, where you can actually use a profile and based on that profile it’ll be able to keep your settings from a temperature perspective, what radio stations you listen to, what are the places that you like to eat, or do recommendations. However, we don’t have that today in our vehicles.
Today we do use cloud, and we only use cloud for what we’re doing today in terms of metadata management. When this application was created, we had specifications coming from our product engineers that really defined all the data. It’s telemetry data, it’s all structured, and we have data dictionaries on everything to help the data scientists understand what it is. As a new company, we are really just starting on our journey to data science. I was actually hired to build out a data science practice for the company. We pull all of our data scientists together to really talk about the things that are possible with this data, and we do this on a regular basis.
There’s a lot of things that we see from a services perspective that we can provide to the customer. In fact, we only do this to provide services for the customer. There’s no reason to use the data unless we can make our products better and we can provide these new types of services for our customers. For instance, in this case we envision that when our collision notification is ready to be deployed — this is something that we’ll be calling you over the telephone and having someone saying “hey we noticed that there’s been a collision, are you okay?” — if someone doesn’t answer we dispatch a unit to that location. Or being able to essentially have a notification pop-up on their phone, saying do you need to send assistance? There’s going to be low impact use cases where someone got rear-ended, we still want to be there for our customer. I think that there was a statistic that one of our data scientists provided me, which was, one out of every ten collisions that ends in a fatality could have been prevented if we had the ability to get someone dispatched quicker. For me, that’s something that I’m definitely looking towards seeing what we can do on our side to make a difference there and to make our vehicles safer.
We’ve talked a lot about what we do as a company. From a company culture perspective, we are absolutely focused on bringing in the best talent. We are really connected and are committed to helping folks understand what we are all about, what we do and entice anyone that might be interested to work for us. We have a number of open positions and please come see me if you’re interested :).
I bought a Toyota Camry mostly because I wanted to understand the full experience, what it is that our dealerships are truly saying to the customer, just so I could understand and be able to talk about it as a Toyota customer. I was actually very impressed with the way they walked me through everything and how they showed us: these are the things that you can do to enable these services, and this is the data we collect. And the reality is that when my wife is driving the car, of course I’m gonna opt in. Why? Because I want Safety Connect. Why? Because if she’s ever in an accident, they’re there for her. These are services that I think people are going to opt in for, because they provide a genuine value add to the vehicle. I also bought the vehicle because it has what they call smart sense, to be able to understand if someone’s to the right or left of me before I make a lane change. All of this really comes down to safety features. From a scale perspective, we definitely recognize that the cost of providing these services is something that we’re grappling with. We’re absolutely looking at ways to optimize algorithms, how we’re storing the data, and how much we’re storing, to really only hold the value-add attributes needed to run these algorithms and provide these services. Thank you very much!
Join us at the next DSS near you:
We’ve got a lineup of equally impressive speakers from companies like Viacom, Netflix, Buzzfeed, Forbes, Verizon, Nielsen, Comcast, Bloomberg, Uber, Google and many, many more.

