Michael joined our Data Science bootcamp in October 2020 for an intensive but rewarding adventure. For the five years before Le Wagon, Michael worked as a Climate Finance Researcher. Let's discover Michael's Data Science group project.
Summary
Over the past week, we capped off Le Wagon’s bootcamp with the crucially important career week: polishing our professional profiles, planning our next steps, and absorbing much sage advice from tech business leaders and recruitment insiders. But the development of our new data science skillset came to a climax the previous week, as we delivered the projects that we had taken from concepts to viable Machine Learning and Deep Learning apps in less than two weeks.
My team developed PlantBase, a plant identification app that uses Deep Learning to identify plants from photographs of their flowers (you can find PlantBase on GitHub here). This required Convolutional Neural Networks (CNNs), which we implemented in TensorFlow, with our code written in Python. As expected, the VGG16 model performed best (we also experimented with alternatives including EfficientNet, ResNet, and simpler CNNs built up from their component layers). After a lot of revision and parameter adjustment, we arrived at a model that could differentiate 16 genera of plants using pictures of their flowers, predicting the correct genus with 61% accuracy, or including it among its top three predictions with 86% accuracy.
We integrated plant care information, scraped from the Royal Horticultural Society website using BeautifulSoup; this information is shown to the app user when they confirm that their plant has been predicted correctly. We also used the MetaWeather API (application programming interface) to integrate a five-day weather forecast for London (where the team was based), which updates whenever the app is used and includes warnings for extreme weather (e.g. frost days, heat waves, gale force winds, or heavy rain). Our team of four completed this app within ten working days, and while we created a viable concept, a lot could be improved with more time.
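To make the difference between the 61% and 86% figures concrete, here is a minimal sketch of how top-1 and top-3 accuracy can be computed from a classifier's predicted probabilities. It is not PlantBase's actual evaluation code: the function name `top_k_accuracy`, the four-class setup, and all the numbers are made up for illustration.

```python
def top_k_accuracy(probabilities, true_labels, k=3):
    """Fraction of samples whose true label is among the k most probable classes."""
    hits = 0
    for probs, label in zip(probabilities, true_labels):
        # Rank class indices from most to least probable and keep the first k.
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical predictions over four genera (illustrative values only).
probabilities = [
    [0.70, 0.10, 0.15, 0.05],  # true class 0 is the top prediction
    [0.30, 0.40, 0.20, 0.10],  # true class 0 is second: a top-3 hit only
    [0.05, 0.15, 0.20, 0.60],  # true class 0 is ranked last: a miss
    [0.10, 0.20, 0.60, 0.10],  # true class 2 is the top prediction
]
true_labels = [0, 0, 0, 2]

top1 = top_k_accuracy(probabilities, true_labels, k=1)
top3 = top_k_accuracy(probabilities, true_labels, k=3)
print(top1, top3)  # 0.5 0.75
```

Top-3 accuracy can never be lower than top-1 accuracy, which is why showing the user a shortlist of three candidate genera to confirm can be a sensible design for an identification app.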
With more experimentation, and potentially more computing power, we could extend our scope to a wider range of plant genera, as well as classifying plants more specifically. One way to achieve this may be to nest multiple layers of CNN models, with each level of classification triggering the next, more specific model. We could also extend the weather API integration to include other cities and towns selected by the user, or use their device’s location data to automate their weather forecast.
The other teams in Le Wagon’s Batch #475 also presented some great projects:
FiveStar explored Airbnb listings in London, and how a property’s attributes can predict a host’s review score.
London Emotions aimed to generate a real-time interactive map of London showing emotion ratings for different areas, based on Natural Language Processing (NLP).
Fight against FakeNews used NLP to analyse news stories and determine whether they are real or fake.
Green Mood TrackR used NLP to understand public sentiment toward energy transition through classification of tweets’ polarity in the United States and the United Kingdom.
Discover Michael's Data Science project at 25:30.
My experience at Le Wagon has exceeded my high expectations, and while the whole course deepened my passion for data science, the project weeks really inspired me to build exciting tech — watch this space for more.