AI and OCR: Volunteering.today

April 21, 2018
AI ML C# Hackathon OCR

Volunteering.today

Sirotilja Valley

Recently, my team and I competed at the ChangeCode hackathon in Zagreb, hosted by King ICT.

The task was to help three kinds of people:

We were free to come up with anything we thought would be helpful, and we had 24 hours to do so. We managed to win a 10,000 HRK award, along with an offer from the organizers to keep working on the app and implement it fully.


The idea

The main problems we found while researching current solutions for aggregating volunteering events were:

Solution?

We created an AI that searches social networks for volunteering events and automatically adds them to our website, a repository of all volunteering events in the country. The AI then notifies the organizers that they can log in to the site with their social accounts to manage their events and volunteers. Volunteers likewise access events through a one-click login. We also sprinkled in some extra features, like facial recognition of the volunteers who attended a particular event based on event photos, and automatic attendance-list generation built on top of it. Every event also includes analytics, such as how many people are talking about it and sharing it.

How did we do it?

The GitHub repository can be found here.

Since we had only 24 hours to come up with the idea and implement it, we used Twitter as a PoC. The same process can be applied to other social networks.

The process goes something like this:

  1. Scrape Twitter for Tweets containing particular volunteering-related hashtags, such as: #Volunteering, #VolunteeringEvent or any tweets related to a particular cause, such as: #VolunteerForImmigrants. Download such Tweets and any accompanying media.
  2. Identify the most important information in the downloaded tweet, such as the location, date, event name, and contact info. Use OCR if media is present.
  3. Get analytics for the tweet, such as how many people have retweeted it, replied to it, or are discussing it, for display on the frontend.
  4. Send the data to the backend API which then generates a new post on the Volunteering.today page.
  5. Tweet to the event organizer that their event was added to our website, and that they can access it and their registered volunteers with a one-click sign-up using their social media account.
  6. Enable volunteers to register for events with a single click, and generate an attendance list by cross-referencing the registered volunteers with faces recognized in pictures of the event uploaded by the event owner.
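
Tying it together, the glue code might look something like the sketch below. Every function name here is a hypothetical stand-in for one of the numbered steps above, not the repo’s actual API:

# Hypothetical pipeline glue; each stub stands in for one of the numbered steps.

def scrape_tweets(hashtags):           # step 1, see "Scraping tweets" below
    return []

def extract_event_info(tweet):         # step 2, text parsing plus OCR on media
    return {}

def get_tweet_analytics(tweet):        # step 3, retweets/favorites/replies
    return {}

def post_to_backend(event):            # step 4, POST to the backend API
    pass

def notify_organizer(tweet, event):    # step 5, the TweetBot notification
    pass

for tweet in scrape_tweets(["#Volunteering", "#VolunteeringEvent"]):
    event = extract_event_info(tweet)
    event["analytics"] = get_tweet_analytics(tweet)
    post_to_backend(event)
    notify_organizer(tweet, event)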

The GitHub repo contains all of the code, and it’s free to use without any attribution.

Free and open source code rules. ♥

So let’s get to it. Every chapter contains a hyperlink to our GitHub repo, where the full implementation can be found.

Scraping tweets

Using the Twitter API, scraping tweets is pretty simple. We used the TwitterSearch Python library to easily build queries and parse the API responses.

Example tweet:

The code for scraping it looks something like this:

import json
from TwitterSearch import TwitterSearch, TwitterSearchOrder, TwitterSearchException

try:
    tso = TwitterSearchOrder()
    # Each hashtag must be its own list element
    tso.set_keywords(['#VolunteeringHashTags', '#AnythingYouWantToSearchFor'])
    # tso.set_language('de') # if we wanted to search for tweets written in German only
    tso.set_include_entities(True)
    ts = TwitterSearch(
        consumer_key='secret...',
        consumer_secret='secret...',
        access_token='secret...-secret...',
        access_token_secret='secret...')

    for tweet in ts.search_tweets_iterable(tso):
        data = generateDataFromTweet({}, tweet)
        json_string = json.dumps(data, ensure_ascii=False)
except TwitterSearchException as e:
    print(e)

The generateDataFromTweet function just pulls the fields we need out of the tweet’s JSON and saves them to our dictionary.

The code can be found here.
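
We never show generateDataFromTweet in this post, so here is a rough sketch of what such a function could look like. The field names are illustrative, based on the standard v1.1 tweet JSON, and not necessarily what the repo uses:

def generateDataFromTweet(data, tweet):
    # Pull the fields we care about out of the raw tweet JSON
    data["Text"] = tweet["text"]
    data["CreatedAt"] = tweet["created_at"]
    data["Username"] = tweet["user"]["screen_name"]
    data["TweetID"] = tweet["id_str"]
    # With include_entities enabled, any attached media shows up here
    media = tweet.get("entities", {}).get("media", [])
    data["MediaURLs"] = [m["media_url"] for m in media]
    return data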

After getting and parsing the tweet, we just write it to a file:

import codecs

with codecs.open("lib/twitter.json", "w", "utf-8-sig") as datoteka:
    datoteka.write(json_string)

We then analyze the tweet using the TwitterAPI library to get the data needed for the frontend analytics.

# "api" is an authenticated TwitterAPI client and "id_tweet" holds the
# scraped tweet's ID; both are set up elsewhere in the script.

# Get retweets, favorites, location and IDs
def getBasicData(data):
    r = api.request('statuses/show/:%s' % id_tweet)
    for item in r.get_iterator():
        data["RetweetCount"] = item["retweet_count"]
        data["FavoriteCount"] = item["favorite_count"]
        data["IDStatus"] = item["id"]
        data["IDStr"] = item["user"]["id_str"]
        data["Location"] = item["user"]["location"]
    return data

# Approximate the number of replies by scanning the mentions timeline
# for tweets replying to this status ("twitter" is an OAuth1 session)
def getReplyCount(data):
    url = "https://api.twitter.com/1.1/statuses/mentions_timeline.json?since_id=" + data["IDStr"]
    r = twitter.get(url)
    reply_count = 0
    for item in r.json():
        if item["in_reply_to_status_id"] == data["IDStatus"]:
            reply_count += 1
    data["ReplyCount"] = reply_count
    return data
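
For each scraped tweet, these helpers can then simply be chained, with tweet being the object from the scraping loop above:

data = generateDataFromTweet({}, tweet)
data = getBasicData(data)    # retweets, favorites, location, IDs
data = getReplyCount(data)   # approximate reply count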

OCRApplication

The OCRApplication is a machine-learning-backed app that analyzes pictures and extracts any kind of text from them, including handwritten notes and arbitrary fonts. It was written in C# using the Microsoft Cognitive Services Vision API.

The complete code can be found on the GitHub repo.

Example image

The code:

The meaty part of the code can be found in the OCRTools class, which contains the methods necessary for image analysis and OCR.

public static async void analyzeImage(string imageFilePath) {
    fileName = Path.GetFileNameWithoutExtension(imageFilePath);
    HttpClient client = new HttpClient();
    string uri = buildHTTPRequest(client);
    HttpResponseMessage response;
    byte[] byteData = getImageAsByteArray(imageFilePath);
    response = await getResponseAndExtractText(client, uri, byteData);
}

public static void extractText() {
    // Run the helper Python script that turns the Vision API JSON into plain text
    ProcessTools.startProcess("python", @"lib/json_to_text.py lib/" + fileName);
}

The analyzeImage method loads an image from a file and sends it to the Vision API, which returns the extracted data as JSON. We then call our helper Python script, which parses the JSON and saves the text to a file.
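
For reference, the equivalent request can also be made straight against the OCR endpoint from Python. This is a minimal sketch, assuming the v1.0 OCR endpoint in the West Europe region and your own subscription key; the repo’s C# helpers do essentially the same thing:

import requests

subscription_key = "secret..."
ocr_url = "https://westeurope.api.cognitive.microsoft.com/vision/v1.0/ocr"

def analyze_image(image_path):
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/octet-stream",
    }
    params = {"language": "unk", "detectOrientation": "true"}
    with open(image_path, "rb") as image:
        response = requests.post(ocr_url, headers=headers,
                                 params=params, data=image.read())
    response.raise_for_status()
    return response.json()  # contains the "regions" structure parsed below

The helper script that walks that JSON: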

import json
import sys

# The C# app passes "lib/<file name>" as the first argument
filePath = sys.argv[1]

json_data = open(filePath + '.json', encoding='utf8').read()
data = json.loads(json_data)['regions']

# The Vision API nests the text as regions -> lines -> words
string = ''
# Time-complexity much? :(
for region in data:
    for line in region['lines']:
        for word in line['words']:
            string += word['text'] + ' '

with open(filePath + '.txt', 'wb') as file:
    file.write(string.encode('UTF-8'))

Sending the data to the backend

Nothing too interesting here. The data is sent to the backend with a POST request containing the contact info, date, location, event name, event image, and so on.

The backend, built with ASP.NET, populates the website every few minutes with new pages generated from the data found in the database.

After a page is generated successfully, the backend triggers a Python TweetBot that notifies the event owner that their event has been added to our event database.

from urllib.parse import quote_plus
from requests_oauthlib import OAuth1Session

# Message that will be tweeted to the owner ("username" and "userID"
# come from the scraped tweet data)
message = "Hi @" + username + ". I just wanted to let you know that your event was added to our volunteering database, where you can log in and manage it, and find volunteers:"
url = "http://volunteering.today"

# URL-encode the whole status so spaces don't break the query string
status = quote_plus(message + " " + url)

def getTwitterSession():
    return OAuth1Session('secret...',
                            client_secret='secret...',
                            resource_owner_key='secret...-secret...',
                            resource_owner_secret='secret...')

twitter = getTwitterSession()

# Send out the tweet and a direct message using the Twitter API
r = twitter.post("https://api.twitter.com/1.1/statuses/update.json?status=" + status)
r = twitter.post("https://api.twitter.com/1.1/direct_messages/new.json?text=" + status + "&user_id=" + userID)

Example tweets to generated events

Example event

Example text detected from tweet media

Facial recognition and attendance generation

Having some time left over, we started implementing the facial-recognition features of the app, using the Microsoft Cognitive Services Face API, which can recognize up to 64 faces in a single picture and remember up to 10,000 unique faces per group (a group corresponds to an event in our app).

Example input photos per user

These are the images a user uploads while registering.
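
Before any identification can happen, these registration photos have to be enrolled in a Face API person group. Here is a rough sketch of that enrollment against the REST endpoints, assuming the West Europe region and placeholder keys; the repo may well do this through the C# SDK instead:

import requests

key = "secret..."
base = "https://westeurope.api.cognitive.microsoft.com/face/v1.0"
headers = {"Ocp-Apim-Subscription-Key": key}
group = "my-event"  # one person group per event

# Create the person group for the event
requests.put(base + "/persongroups/" + group, headers=headers,
             json={"name": "My Event"})

def enroll_volunteer(name, image_paths):
    # Create a person in the group, then attach their registration photos
    r = requests.post(base + "/persongroups/" + group + "/persons",
                      headers=headers, json={"name": name})
    person_id = r.json()["personId"]
    for path in image_paths:
        with open(path, "rb") as image:
            requests.post(base + "/persongroups/" + group + "/persons/"
                          + person_id + "/persistedFaces",
                          headers={**headers, "Content-Type": "application/octet-stream"},
                          data=image.read())
    return person_id

# Train the group so the identify call can match faces against it
requests.post(base + "/persongroups/" + group + "/train", headers=headers)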

Example event photo

Detected faces

Generated attendance list

The codez

Once again, the entire code can be found on the repo.

The meaty part of the code is the IdentifyFace class.

class IdentifyFace
{
    public async static void identifyFace(string base64)
    {
        // Decode the uploaded event photo and detect all faces in it
        Stream image = ImageTools.Base64ToImage(base64);
        var faces = await Globals.faceServiceClient.DetectAsync(image);
        var faceIds = faces.Select(face => face.FaceId).ToArray();

        // Match the detected faces against the volunteers enrolled for this event
        var results = await Globals.faceServiceClient.IdentifyAsync(Globals.personGroupId, faceIds);
        foreach (var identifyResult in results)
        {
            Console.WriteLine("Result of face: {0}", identifyResult.FaceId);
            if (identifyResult.Candidates.Length == 0)
            {
                Console.WriteLine("No one identified");
            }
            else
            {
                var candidateId = identifyResult.Candidates[0].PersonId;
                var person = await Globals.faceServiceClient.GetPersonAsync(Globals.personGroupId, candidateId);
                Console.WriteLine("Identified as {0}", person.Name);
            }
        }
    }
}

It decodes the Base64 image passed to it and sends the result to the Face API, which returns whether each detected face matches any of our registered volunteers. Pretty simple, pretty sweet.
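
From there, generating the attendance list is little more than a set intersection between the registered volunteers and the names the Face API identified in the event photos. A trivial sketch with hypothetical names, not repo code:

def generate_attendance_list(registered_volunteers, identified_names):
    # Keep only registered volunteers whose faces were spotted in event photos
    identified = set(identified_names)
    return [name for name in registered_volunteers if name in identified]

attendance = generate_attendance_list(
    ["Ana", "Ibrahim", "Marko"],   # registered for the event
    ["Ibrahim", "Marko"],          # names returned by identifyFace
)
print(attendance)  # ['Ibrahim', 'Marko']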

The hardest part

The hardest part of it all was, funnily enough, presenting all of it in 5 minutes to a judge panel.

It’s fun to tweet an event and watch it appear on our website in a matter of seconds, but the very simplicity of it, the one-click registration and login and the automatic generation of the website, makes it look like nothing special. Yet it is a complex-ish system built in just under 24 hours, so the presentation was the most stressful part, as you can see from the look on our teammate Ibrahim’s face:

But we did alright. After a sleepless night and a couple of hiccups, we did well enough to win an award, and we’re proud of it.
