A few months ago, during my last year of engineering studies, I was handed the opportunity to start a small AI-driven project. It sounded hugely interesting, so I grabbed it with both hands. But then the doubts started to rise: I had barely any knowledge about AI at all…
The start
At the time, the biology department of Vrije Universiteit Brussel was running tests with different kinds of bacteria in petri dishes (please do not ask me which or why; they tried to explain it multiple times, but biology is clearly not my strongest subject). The main idea of the test was to contaminate a petri dish with a bacterium and observe how fast new cell colonies would form, which meant counting the number of cell colonies in the dish.
This was done by hand, counting each colony and putting a blue dot above it to remember which ones had already been counted. As you might expect, this is very time-consuming and not very interesting work. They estimated their own precision to be around 80%, and I had a feeling that a machine could do better (things still seemed simple in my head back then).
Getting to know AI
I knew the first thing I had to do was gather more knowledge about AI, and especially about Convolutional Neural Networks, since these are the go-to architecture for object detection and image recognition. So I spent a few days watching YouTube videos and reading papers about them, until I felt it was time to start experimenting with Keras and TensorFlow to get familiar with these frameworks.
Problems arising
At the same time, I started to realise that I needed a lot of data to make this work. Like, a LOT more than I expected, and in even bigger contrast to what was available, which was none… This seemed a disaster for the project, since I only had a few weeks to finish it and still had a lot of other work to do simultaneously. So I decided to make a small iOS app that would let the biology department take pictures the same way they would later be taken to predict the number of colonies in a dish. This meant, of course, learning iOS app development and Swift. Luckily there is a lot of content out there to make this a quick job.
Moving on with AI
Meanwhile I was getting the hang of Keras and starting to understand what to look for in a model, but there was still a lot more to learn. With the examples and tests rapidly getting more complex, I started to hit a rather unpleasant wall, namely the processing power of my own computer. I was trying out different things in Keras, some of which needed up to 2 days of training. That was becoming a real pain in the … since the work for my other courses had to be done on the same computer, which was clearly unusable during training. This is where Brainjar and IBM jumped to the rescue.
Brainjar
Brainjar, a company focussing on artificial intelligence, was kind enough to invite me into their office and give me a space to work on the project, among a whole team of AI specialists. This was a wonderful experience, since it not only brought me a lot of knowledge but also an insight into a professional environment, which is always welcome for a student.
The extra knowledge was awesome, but didn’t fix my computing power deficit… Luckily, Brainjar also brought me into contact with IBM, whose PowerAI machine they were using. After a quick pitch of the project to IBM, they seemed interested, and allowed me to use the PowerAI so I could train and test a bit more than with my own computer.
PowerAI
Clearly I was psyched and curious about that supercomputer called PowerAI (the name alone already made me feel like a 5-year-old who had received a new toy). Once I started testing on it, I was blown away by the power it was packing. Tests that took multiple days on my MacBook suddenly took only a few hours to train on this powerhouse. Keras has built-in support for training on multiple GPUs, which meant no worries at all about porting anything I was doing. This huge power increase meant I quickly got to the point where I decided to actually take on the project itself.
Choosing a model
My first idea was to train a model and run it on the edge, in the form of a CoreML model in an iOS app. Apple’s coremltools documentation states that Keras models can be converted, so that sounded like a good plan. It also meant I needed a lightweight model (we don’t want to wait too long for a result, and we don’t want to drain a whole battery for every picture we want to analyse). After a bit of looking around, I decided to go for the SqueezeDet architecture, described in the arXiv:1612.01051 paper.
First real tests
Since having a dataset with petri dishes was still a long way down the road, I started my first tests with the KITTI dataset, which was used for benchmarking the model. I did a few different trainings and quickly found out that the model reached a pretty decent accuracy after as few as 10 training runs. And since I was planning on using a different dataset anyway, I usually limited my trainings to 10–15 runs. Having the PowerAI meant I could do these in about an hour, literally during lunch…
Porting to CoreML
All was going well at this point, too well I guess, because the next phase of the project was one of the more heartbreaking moments I have had lately. I got the model through coremltools, which seemed to tell me all was well, but when I ran an image through the ported model, half of the returned feature vector values turned out to be empty. I spent the next 2 hours trying a million different things, supposing I was the one to blame and had made a stupid mistake somewhere down the line (somewhere deep down I still think I am the one to blame…). After a while I gave up, as I realised that I only had a few days left to find an alternative. I told myself it could be due to the “fire” layers, which split and concatenate the intermediate data in the model and might not survive the port to CoreML so well.
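To make those “fire” layers a bit more concrete: in the SqueezeNet family of architectures, a fire module first squeezes the input with 1x1 convolutions and then splits into parallel 1x1 and 3x3 expand branches, whose outputs are concatenated along the channel axis. The sketch below only does the parameter arithmetic for that structure, to show why it keeps a model lightweight. The channel sizes are illustrative assumptions, not values read from my trained model, and the function names are made up for this example.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a standard 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + out_channels


def fire_params(in_channels, squeeze, expand_1x1, expand_3x3):
    """Parameters of one fire module: a squeeze layer, then two parallel expand branches."""
    squeeze_layer = conv_params(in_channels, squeeze, 1)
    branch_1x1 = conv_params(squeeze, expand_1x1, 1)
    branch_3x3 = conv_params(squeeze, expand_3x3, 3)
    return squeeze_layer + branch_1x1 + branch_3x3


def fire_output_channels(expand_1x1, expand_3x3):
    """The two expand branches are concatenated along the channel axis."""
    return expand_1x1 + expand_3x3


# Illustrative channel sizes (assumptions for this sketch).
IN_CHANNELS, SQUEEZE, EXPAND_1X1, EXPAND_3X3 = 96, 16, 64, 64

fire = fire_params(IN_CHANNELS, SQUEEZE, EXPAND_1X1, EXPAND_3X3)
out_channels = fire_output_channels(EXPAND_1X1, EXPAND_3X3)
plain = conv_params(IN_CHANNELS, out_channels, 3)

print(f"fire module:       {fire} parameters")
print(f"plain 3x3 conv:    {plain} parameters")
print(f"reduction factor:  {plain / fire:.1f}x")
```

The split-and-concatenate step is exactly the part I suspected of not converting cleanly, although I never confirmed that.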
What now?
What could I do? I could try a different model, like YOLO, which had been proven to work as a CoreML model. Or I could switch things up and run the model as a cloud service, with the app sending the image to the cloud and receiving the result. I chose the latter, since I am still far from confident with AI. It also meant my implementation of the model could stay much closer to the one from omni:us Engineering, who made a Keras implementation of the SqueezeDet model, available on their GitHub: https://github.com/omni-us/squeezedet-keras.
I quickly got to the point where I could get predictions from a server, with a small and simple Flask script making it reachable from anywhere over HTTP.
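A minimal sketch of what such a prediction endpoint looks like, under loud assumptions: to stay dependency-free it uses Python's built-in `http.server` instead of Flask, and `detect_colonies`, the `/predict` path, the 0.5 score threshold and the JSON response shape are all hypothetical placeholders rather than the actual script. In a real setup the stub would be replaced by a call to the trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

SCORE_THRESHOLD = 0.5  # assumed cut-off, not a value from the project


def detect_colonies(image_bytes):
    """Hypothetical stand-in for the model: returns (x, y, w, h, score) tuples."""
    # A real implementation would decode the image and run the detector here.
    return [(10, 12, 8, 8, 0.91), (40, 33, 7, 9, 0.67), (5, 60, 6, 6, 0.21)]


def count_confident(detections, threshold=SCORE_THRESHOLD):
    """Count the detections whose confidence score reaches the threshold."""
    return sum(1 for *_box, score in detections if score >= threshold)


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "Unknown endpoint")
            return
        # The request body is assumed to be the raw image bytes.
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        result = {"count": count_confident(detect_colonies(image_bytes))}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8000):
    """Build the HTTP server without starting it."""
    return HTTPServer((host, port), PredictHandler)
```

Calling `make_server().serve_forever()` starts the service; the app then only has to POST the picture to `/predict` and read the count from the JSON response.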
Disaster
With the technical stuff mostly working, I had 1 week left before the end of the project, so I ran back to our biology department to see if they had had any luck with taking pictures. But when I arrived, it became clear that the short time in which this all had to happen was not realistic. They had had fewer than 10 petri dishes ready since the start of the project, which would clearly not be enough to train the model. I did some tests with data augmentation, but nothing got me any kind of decent results.
Unfortunate, but I guess life is not all sunshine and rainbows. With the deadline around the corner, I decided to end the official project with only the ability to find cars in an image. That still got me through the class easily, but I felt kind of disappointed at not having a real result yet. Nevertheless, I will not drop this: I will get back in contact with the biology department as soon as they have more pictures, to still try and get a real result out of this. When that time comes, I will follow this up with the numbers and some examples!
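As for the data augmentation mentioned above: for object detection, every transformation of the image has to be applied to the bounding boxes as well. Here is a minimal sketch of that idea, using a horizontal flip and a 90-degree rotation on a tiny nested-list “image”. The `(x, y, w, h)` box format and the function names are assumptions for illustration, not the code from my tests.

```python
def flip_horizontal(image, boxes):
    """Mirror the image left-right and move each (x, y, w, h) box with it."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - x - w, y, w, h) for (x, y, w, h) in boxes]
    return flipped, new_boxes


def rotate_90_clockwise(image, boxes):
    """Rotate the image 90 degrees clockwise and move each box with it."""
    height = len(image)
    # Reversing the rows and transposing gives a clockwise rotation.
    rotated = [list(column) for column in zip(*image[::-1])]
    # A box's width and height swap places after the rotation.
    new_boxes = [(height - y - h, x, h, w) for (x, y, w, h) in boxes]
    return rotated, new_boxes


# A 3x4 toy image with one two-pixel "colony" in the top-left corner.
image = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
boxes = [(0, 0, 2, 1)]

flipped_image, flipped_boxes = flip_horizontal(image, boxes)
rotated_image, rotated_boxes = rotate_90_clockwise(image, boxes)
print(flipped_boxes, rotated_boxes)
```

Combining flips and rotations like this turns one annotated picture into several, but the variants all show the same dish, which is probably part of why it was no substitute for more real pictures.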
Conclusion
This project was the whole package. The excitement, the knowledge, the challenge, the struggle, the hype and the disappointment… In the end, I mostly had great fun learning about AI, working together with Brainjar, and experiencing the joy of working with tools like the IBM PowerAI. In total this project took me about 12 days. With 15 KITTI training runs on the SqueezeDet model taking up to 2 days on my own MacBook, this wouldn’t even have been close to possible without the sheer power brought to this project by IBM. And having the guys from Brainjar around me to help me find my rookie mistakes and assist me in learning about AI was a big reason this project got to the point it is at, and boy did they make it fun!
So I would love to end this with a big shoutout and thank you to IBM and Brainjar! I hope we meet again soon!