In this article, we learn how to create a system that detects whether a sink contains dishes that need washing up. We will need an ONVIF-compatible network camera, a Google Cloud Platform account to train the models, and the Gravio Enterprise edition to deploy the models to your edge devices.
Note: you can use Gravio for free with the people-counting model, but if you would like to create and deploy your own computer vision models, as we do in this tutorial, you will need an Enterprise subscription. You can also use the first part of this tutorial just to learn how to create models, even for use outside Gravio.
In this tutorial, we use a camera to detect whether there are dishes in the sink. If there are, we trigger an action that plays a “Warning Sign” audio file. Of course, you can trigger anything else that has an API or that we have a component for.
We follow these steps:
If you want to follow the entire tutorial, it may take multiple days, mainly because you will have to install the system and leave it running to gather training images. The second big chunk of work is labelling the images to prepare them for training; depending on how many training images you have, that can take several hours. Apart from those two tasks, the tutorial can be completed in less than 30 minutes.
One of the first steps is to start collecting images, and you have to make sure you’re getting good ones, so start with a solid camera installation. This is how it looks for this example:
The camera is connected via a LAN cable in this case, which also provides power via PoE (Power over Ethernet).
Once the camera is connected to the same network as Gravio, Gravio will pick it up in the “Cameras” section. On Mac and Linux, the same will happen if you connect a USB camera. We are in the process of adding more camera feed systems. Join our Slack channel to be notified when they are released:
Enter the username and password you have set for your camera and save your pictures at frequent intervals, depending on how quickly the situation changes. Don’t draw any frames.
For now, we have to add the camera to any computer vision layer so that it starts capturing images. To do this, go to the settings, then “Image Inference”, and deploy any available model.
This can take a while depending on your internet connection. Once the model is deployed, go back to the “Devices” tab and create an Area (for example “Kitchen”) and add that AI layer to it:
Now add the camera and switch it on by clicking the checkbox.
As soon as you click the checkbox, your camera will start capturing images to your hard drive.
Wait a few hours or even days, depending on how many images you want for training, then open the folders with the images. How many you need depends on how many items you want to detect and how varied the images are. We suggest a few hundred images, ideally with many of them slightly different from each other. You will find the images in the ‘mediadata’ folder within the Gravio folder: on Mac, that is under /Library/Application Support/HubKit/mediadata/, on Windows under C:\ProgramData\HubKit\mediadata\, and on Linux under /var/opt/hubkit/mediadata/. More information about the file paths can be found in the Gravio Documentation.
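To get a quick feel for how many images you have collected so far, you can count them from the command line. A minimal sketch, assuming JPEG captures and the macOS path (substitute the Windows or Linux path from above):

# Count all captured images across cameras and days
find "/Library/Application Support/HubKit/mediadata/" -name "*.jpg" | wc -l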
The next step is to create an inference model file. Gravio uses TensorFlow, which is why you can create those models on your Google Cloud Vision account, part of the Google Cloud Platform. Training models is a very compute-intensive process, hence Google charges for the computing power, but you often get some credit to start with and try it out.
To get started, head over to https://cloud.google.com/vision/ and click on the “Get started for free” button.
After you log in with your Google account, you will see a screen similar to this:
Follow the sign-up instructions until you reach a screen similar to this:
Here you search for “AutoML” and click on “Vision”:
Which will open this screen:
Gravio supports “Object Detection”, so please click on “Get Started”. The first time you do this, you may have to enable the “AutoML API”:
Then you create a new data set:
Give it a sensible name and pick “Object Detection” on the far right:
By now, you should hopefully have plenty of varied images collected from your camera. Gravio saves the pictures in the /mediadata/ folder of your Gravio installation. Just click on the camera device and your images will be split into one folder per day.
Take as many different images as you can and upload them to a Google Cloud Storage bucket. In our case, we created a subfolder and added them there:
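If you prefer the command line over the web console, a bulk upload could look like the sketch below; the bucket and folder names are placeholders, and gsutil (introduced later in this tutorial) must already be installed and authenticated.

# Upload the captured images to a subfolder of your bucket (names are examples)
gsutil -m cp -r "/Library/Application Support/HubKit/mediadata/<camera-id>/" gs://your-bucket/dishes-images/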
Once uploaded, you can click on the “Images” tab and start creating the labels for the items you would like to detect:
Google also provides an online tool to assign labels to items in the images. Note that this is a very tedious and time-consuming task; you may want to consider outsourcing it to companies that specialise in image tagging.
But if you want to do it yourself, this is how it could look:
You have to do this for a few dozen, if not hundreds of, images, until Google has enough examples to split them into three sets:
The “Validation” and “Test” columns will fill up automatically after a while; just ensure that you have uploaded and tagged enough images. You may have to do this in multiple batches.
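As an alternative to labelling everything in the browser, AutoML Vision can also import pre-made bounding-box labels from a CSV file stored in your bucket. A rough sketch of the row format at the time of writing, with invented file names and box coordinates (values are relative, between 0 and 1):

TRAIN,gs://your-bucket/dishes-images/img_0001.jpg,plate,0.12,0.40,,,0.38,0.72,,
TEST,gs://your-bucket/dishes-images/img_0002.jpg,mug,0.55,0.20,,,0.70,0.41,,
UNASSIGNED,gs://your-bucket/dishes-images/img_0003.jpg,bowl,0.05,0.60,,,0.25,0.85,,

Rows marked UNASSIGNED are distributed across the three sets automatically.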
That’s it!
Once your images are uploaded and you have enough of each type, you can start the training process. This is where you will incur costs, but the initial free Google credits should be enough for you to give it a first try. Just click on “Start Training”.
It may take a while for the training to finish (multiple hours). Once the training is done, it’s time to download the models Google has created.
Note: if you want to test the model first, you can deploy it on Google temporarily and test it by uploading a random test image. Google will then try to detect objects, and you will see whether it worked:
If your model works well, you can download it to your computer for deployment on your Gravio installation. You will need the “TensorFlow Lite” files for Gravio.
To do that, you will need to install “gsutil” on your computer. Please follow the instructions for your operating system on Google’s website: https://cloud.google.com/storage/docs/gsutil_install
Once that’s installed and authenticated, you can download the models Google has created for you with the following command:

gsutil cp -r gs://<name-of-your-bucket> ./<destination_folder>

It will look something like this:
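For a TensorFlow Lite export, you should end up with three files on disk; the folder name below is just an example:

./my-dishes-model/model.tflite (the trained model)
./my-dishes-model/dict.txt (the labels, one per line)
./my-dishes-model/tflite_metadata.json (input size and other metadata)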
In order to deploy the models to your edge nodes, you will need to have access to an Enterprise edition of Gravio. The Enterprise edition comes with the “Gravio Coordinator” which will host your models. From there you can deploy them to the various edge nodes.
The Gravio Coordinator can be installed on any Linux machine, on-premises or in a (private) cloud. Find out more about installing the Gravio Coordinator here: https://doc.gravio.com/coordinator/2/en/topic/gravio-coordinator
In a nutshell, if you have a Raspberry Pi, you can install Gravio, including the Coordinator, as follows:
Congratulations, your Coordinator is now installed and your Edge devices are ready to be connected to the Coordinator.
Now let’s connect your Edge HubKits (your edge computing server, or servers if you have multiple) to the Coordinator. First, start Gravio Studio, which you can download from the Apple App Store or the Microsoft Store. Instead of creating an account and logging in, you can now log in with the credentials you have just created on the Gravio Coordinator. To do this, click on the “Log in” button
And then click on “Log in with Gravio Coordinator”
Here you can enter the IP of your coordinator and log in using the account you have created on your Coordinator:
Once logged in, you can see all Gravio HubKits in your network. Click on the one you would like to connect to the Coordinator, and navigate to the Settings and copy your Hub ID:
Go back to your Coordinator, click on the “Hub” tab, and click “Add”:
Note: we have previously added other devices to this Coordinator, but yours is likely to be empty in the beginning.
You can then enter the details of your HubKit including the ID you have copied earlier:
Set a password that the HubKit and the Coordinator will use to authenticate their communication. If you like, you can also enable Blockchain so that this HubKit becomes part of a Distributed Ledger Technology system. Click “Save” and enter the same password into the password field of your HubKit:
You can then test the Login by clicking on “Test Login”:
You should see a success message. Your Coordinator is now connected with your Edge server HubKit.
This means that if you deploy a model to your Coordinator, it will be available on every Gravio Hub connected to that Coordinator.
Now that you have your models downloaded and your Gravio system set up, it’s time to deploy them to the Gravio infrastructure. Start by renaming the files:
Open the dict.txt file and remove the “background” line. Then add the line numbers to the beginning of each line. For example:
1 bowl
2 cutlery
3 mug
4 plate
The filename remains the same.
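If you would rather script this than edit the file by hand, here is a minimal shell sketch; it assumes dict.txt sits in the current folder and overwrites it in place:

# Drop the "background" line, then prefix each remaining label with its line number
grep -v '^background$' dict.txt | awk '{print NR" "$0}' > dict.tmp && mv dict.tmp dict.txt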
On your Coordinator, open the “Inference Models” tab, click on “Create”, and then “TensorFlow Lite”. Note that you could also upload a previously exported ZIP file with all the model data, but in this case, we create the model from scratch:
A popup will open and you can select the files we’ve just renamed:
Select “Count” and “JSON” to get the value as JSON in Gravio. Also include “DetectionValues”, and you can set the confidence level above which the model should trigger.
Click on “Create” and your new bespoke Model will appear:
Now, after restarting the Gravio HubKit and logging out and back in, this model can be deployed to the Gravio Hub. In Studio, go back to the Settings and click “Inference Models”:
Once deployed, the model is available for application in the data layer and can be used like a sensor:
Do you remember how we created the Area and Layer in Step 1? Go back to that layer and delete it:
Then re-add that newly available layer along with the camera:
Add the camera again and enable it:
That’s it! If you click on “Data” and enable the “Live” switch, you should now see data coming through, depending on what the camera sees:
If items are detected, you will get labels and coordinates. If none are detected, you will get “detections”:[]
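A payload with one detected item might look roughly like this; the exact field names can vary by Gravio version, and the values below are invented for illustration:

{"detections": [{"label": "plate", "confidence": 0.91, "box": [0.12, 0.40, 0.38, 0.72]}]}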
Now, let’s trigger an Action if something is detected. First, let’s create a simple Action. In this case, we just play a sound:
We trigger the Action that plays an alert sound if the number of detected items is greater than 0.
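Conceptually, the Trigger condition boils down to checking the length of the detections array. If you want to sanity-check a payload outside Gravio, here is a quick sketch using jq (the sample payload is made up; Gravio’s Trigger UI evaluates the real condition for you):

# Prints the number of detected items; the Trigger should fire when it is greater than 0
echo '{"detections":[{"label":"plate","confidence":0.91}]}' | jq '.detections | length'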
That’s it, you have now connected your Action with your Trigger, your Trigger with your Camera, and your Camera with your AI model. Happy object detecting!
Creating your own models has become very easy with Gravio. The biggest part of the work is the setup, collecting images, and labelling them for training.
But being able to create your own image recognition models is very powerful, especially since you don’t need specialised and expensive gear.
Please note that the quality of your AI detection system will heavily depend on the quality of your training images and your labelling. Machine Learning is never finished. Keep improving your models. And please keep privacy in mind when working with cameras filming people, even if you’re not storing the images in production.
Hero photo credit: Scott Umstattd on Unsplash