Sunday, May 24, 2020

Blockchain simplified

In this article, I will attempt to give a high-level overview of Blockchain, some current and potential use-cases, and challenges with Blockchain in simple words.

What is Blockchain?
A blockchain is simply a ledger (one which records transactions) that is decentralized (that is, instead of living on one centralized server, an exact copy is stored on multiple nodes) and immutable (that is, once recorded, entries cannot be altered without detection). Blockchain makes this possible by using cryptographic hashing and digital signatures, linking/chaining of blocks, a consensus protocol, and other techniques. (In the interest of keeping this simple and high level, I will skip the details of all those things for now.) The ledger itself can also be programmed to trigger transactions automatically.
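To make the "chaining" idea a little more concrete, here is a minimal sketch in Python. It is only an illustration and not how Bitcoin or any real blockchain is implemented: the function names (block_hash, add_block, is_valid) and the sample transactions are made up for this example, and there is no network, consensus, or mining involved. It only shows why changing an old entry is easy to detect when every block stores the hash of the previous one.

```python
import hashlib
import json


def block_hash(index, transactions, prev_hash):
    """Return the SHA-256 hash of a block's contents."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, transactions):
    """Append a block that points at the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    block = {
        "index": index,
        "transactions": transactions,
        "prev_hash": prev_hash,
        "hash": block_hash(index, transactions, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        expected = block_hash(block["index"], block["transactions"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


ledger = []
add_block(ledger, ["Alice pays Bob 5"])
add_block(ledger, ["Bob pays Carol 2"])
print(is_valid(ledger))  # True

# Tampering with an old transaction breaks the stored hash.
ledger[0]["transactions"] = ["Alice pays Bob 500"]
print(is_valid(ledger))  # False
```

In a real network, many nodes hold a copy of the chain and run a check like this, which is what makes a quiet edit on one copy ineffective.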

So, what makes Blockchain useful?
Decentralization and immutability are the biggest technical benefits which translate into two big benefits from a business perspective -
  • Trust with no dependence on a central authority
  • Transparency with publicly verified transactions
Trust allows individuals, organizations, machines, and algorithms to freely transact and interact with one another with little friction. Transparency allows us to avoid intermediaries like lawyers, brokers, and bankers.

In the next section, we will look at some of the use cases to understand these benefits in more detail.

Blockchain use cases
  • Financial currency - Bitcoin is probably the best-known use case of Blockchain, so much so that people often believe Bitcoin and Blockchain are interchangeable terms. Bitcoin is a digital currency that people can use to buy things and pay others in the network without the need for intermediaries or a central authority (like the central bank of a country). Bitcoin transactions are stored in blocks and are publicly verifiable, while the entities sending and receiving coins are identified only by pseudonymous addresses. Anyone can create a block (the technical term is 'mining' a block), and the miner earns newly issued Bitcoin plus transaction fees for putting the transactions on the block.
  • Social Media- Steem is a blockchain-powered social media platform (think of something like Facebook) where users are rewarded (in Steem currency) for writing content and/or liking/commenting. It promises control of your own content, no sale of user data, and free speech. The content (posts, comments, likes) is stored on the blocks and the company earns from people/companies wanting to share promoted content on the platform.
  • Cloud storage- Sia is a decentralized cloud storage platform that leverages blockchain technology to create a data storage marketplace where people can rent spare hard-disk space on networked machines. The blocks here store transactions between the host and the renter, and the company earns a small commission on every rent amount. The platform claims that the rented disk space is cheaper than that of public cloud storage companies and lower risk because of decentralization.
  • Supply Chain- Provenance offers a platform to combat counterfeiting by providing stakeholders with a transparent, secure, and highly accurate audit trail of any item in the entire supply chain. Here, physical goods are fitted with tamper-proof RFID tags, holograms, and QR codes that get scanned at each stage of the supply chain, and all of the information is recorded on a blockchain. The company charges a fixed monthly fee for on-boarding and maintaining the blockchain infrastructure.
  • Academic research - A blockchain-based system for research data could prevent data manipulation by providing a complete, transparent audit trail of all data that is collected, processed, and accessed by researchers. Any modifications made to research data would require at least 51% consensus from stakeholders and would be visible to everyone- ensuring high data quality and preventing individuals from acting dishonestly.
  • Internet of Things (IoT)- Blockchain can be used to record the sensor data information shared by various IoT devices which power smart-homes, autonomous vehicles, supply chain, automotive industry, etc. Blockchain solves the need around interoperability amongst devices of various organizations, provides a scalable solution against a centralized database, and provides additional security and audit of the data exchange between multiple devices.
As can be seen, Blockchain serves a wide variety of use cases. However, Blockchain is not for everyone. In the next section, we briefly explore when Blockchain is not for you.

When would Blockchain not be for your use case? 
  • Your system doesn't need transparency
  • Your system needs to be very fast
  • Your system is not transaction-based
Finally, we look at what the future holds for Blockchain.

Future of Blockchain
I found this conclusion from HBR on Blockchain pretty apt and am therefore quoting it as is - "True blockchain-led transformation of business and government, we believe, is still many years away. That’s because blockchain is not a disruptive technology, which can attack a traditional business model with a lower-cost solution and overtake incumbent firms quickly. Blockchain is a foundational technology: It has the potential to create new foundations for our economic and social systems. But while the impact will be enormous, it will take decades for blockchain to seep into our economic and social infrastructure."

Sunday, April 26, 2020

Kafka for Business Professionals

If you are in the IT industry, I am sure you have already heard about Kafka. There are a lot of articles on how Kafka works from a technical standpoint but very few on exactly why it is needed and what use cases it serves. This article is my humble attempt at the latter.

Overview: Kafka as Message Queue

Let's first understand that Kafka is a message queue. What this means is that a Message-Producer publishes a message onto a Kafka queue, from which a Message-Consumer consumes that message. Now, why is a message queue needed in the first place? The answer is that it acts as an intermediate communication layer that helps various modules (aka services) decouple from each other. This is the basis of a microservices-based architecture.
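For readers who like to see things in code, the decoupling can be sketched with a toy, in-memory version of a queue. This is not the Kafka client API; the Topic class, its publish and consume methods, and the service names are all hypothetical and exist only for this illustration. The point is that the producer only knows about the topic, and each consumer keeps its own position (offset) in the log, so services can read the same messages independently.

```python
class Topic:
    """A toy append-only log; a hypothetical stand-in for a Kafka topic."""

    def __init__(self):
        self._log = []
        self._offsets = {}  # consumer name -> next position to read

    def publish(self, message):
        """Producer side: append a message to the end of the log."""
        self._log.append(message)

    def consume(self, consumer):
        """Consumer side: return the messages this consumer has not seen yet."""
        start = self._offsets.get(consumer, 0)
        messages = self._log[start:]
        self._offsets[consumer] = len(self._log)
        return messages


orders = Topic()
orders.publish("order-1 created")
orders.publish("order-2 created")

# Two independent services read the same messages at their own pace.
print(orders.consume("billing"))   # ['order-1 created', 'order-2 created']
print(orders.consume("shipping"))  # ['order-1 created', 'order-2 created']

orders.publish("order-3 created")
print(orders.consume("billing"))   # ['order-3 created']
```

Neither service knows the other exists, and the producer does not change when a new consumer is added, which is the decoupling described above.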

Kafka's Advantages

Next, let us understand how Kafka's founders viewed data, which is fundamental to why such a platform was created in the first place. They believed that instead of focusing on piles of static data in relational databases, caches, and key-value stores, we should focus on data in real time, as and when it is captured. Let's understand this by looking at the data generated in one user session on Netflix. Data is generated by events corresponding to various user activities, viz. logging in, browsing different genres, watching a preview, selecting and playing a movie, pausing it, and resuming it after a while. Across all these events, appropriate actions, like recommendations or resuming playback on another device, have to be generated and fed back to the user in real time. The 'real-time' part of this example is where Kafka fits in. Kafka allows events to be published and consumed by various services with high throughput (a million records per second) and low latency (less than 20 milliseconds), and thus caters to the high data volumes generated by today's systems.

Kafka wasn’t the first one in the market with this idea. We had JMS, RabbitMQ, and AMQP but what worked in favor of Kafka was higher throughput, reliability, and replication characteristics suited for today's real-time logging and analytics requirements. RabbitMQ can also process a million messages per second but requires a big cluster (30+ nodes) for in-memory operations and thus is not suitable from a hardware perspective.

Another advantage of Kafka is that it allows on-the-fly horizontal scaling and is fault tolerant. Traditional systems are limited in scalability by hardware limits and by the downtime needed to add new hardware; with Kafka, adding a new machine does not require downtime, and there is no fixed limit on the number of machines in a cluster. As for fault tolerance, many non-distributed systems have a single point of failure. In Kafka, on the other hand, data in a 3-node cluster can remain available even if two nodes go down, provided each topic is replicated to all three nodes.

Usage

Coming onto usage in the market, according to HG insights[1], approx. 20,000 companies use Kafka including LinkedIn, Spotify, Uber, JP Morgan Chase, New York Times, Shopify, Cisco, CloudFlare, and Netflix. Let's look at some of the use cases -

  • Uber uses Apache Kafka as a message bus for connecting different parts of its ecosystem. They collect system and application logs as well as event data from the rider and driver apps, viz. location coordinates of the rider and driver, and use this for computing the nearest vehicle, the exact route taken by the vehicle, the price, etc. They handle a trillion-plus messages per day (as of 2017) over tens of thousands of topics.
  • Netflix which we covered above has ~500 billion events and ~1.3 PB per day generated from video viewing activities, UI activities, Error logs, Performance events, Troubleshooting & diagnostic events
  • New York Times uses Kafka to connect multiple Content Management Systems, third-party data and wire stories on one side and a range of services and applications like search engines, personalization services, feed generators, as well as all the different front-end applications, like the website and the native apps that need access to this published content on the other side. Whenever an asset is published, it is made available to all these systems with very low latency — this is news, after all — and without data loss.
  • LinkedIn handles 7 trillion messages per day, divided into 100,000 topics, 7M partitions, stored over 4000 brokers. Kafka is used extensively throughout its software stack, powering use cases like activity tracking, message exchanges, metric gathering.
You can view [1] and [2] for more use cases.

I hope this basic info was useful!

[1] https://discovery.hgdata.com/product/apache-kafka
[2] https://blog.softwaremill.com/who-and-why-uses-apache-kafka-10fd8c781f4d
[3] https://kafka.apache.org/powered-by

Saturday, April 25, 2020

Uber and Airbnb - Both sharing economy, yet so different


Airbnb (founded in 2008) and Uber (founded in 2009) together defined what is now known as the "Sharing Economy". The U.S. Commerce Department, in a report [1] in June 2016, attempted to define and map out the contours of this emerging business sector as follows:
  1. They use information technology (IT systems), typically available via web-based platforms, such as mobile “apps” on Internet-enabled devices, to facilitate peer-to-peer transactions.
  2. They rely on user-based rating systems for quality control, ensuring a level of trust between consumers and service providers who have not previously met.
  3. They offer the workers who provide services via digital matching platforms flexibility in deciding their typical working hours.
  4. To the extent that tools and assets are necessary to provide a service, digital matching firms rely on the workers using their own.

Both Uber and Airbnb tend to meet the above criteria. Despite this, they are different in two main respects -
The first relates to geographical scalability of marketplace [2] -
Uber’s model relies on hyperlocal network effects, i.e. the addition of a unit of supply (a driver) makes the product more valuable for the demand side (riders) within a small geographic radius. However, when Uber expands to other cities, they have to re-invest in driver acquisition without the benefit of any latent demand. Airbnb’s model, on the other hand, is built on cross-border network effects, i.e. the addition of a unit of supply (a host) makes the product more valuable for the demand side (guests) across geographic boundaries. While Uber faced local competitors like Didi in China, Ola in India, Grab in SE Asia who had replicated the same model in their own market, Airbnb, on the other hand, faced very little competition from other regional startups who had limited supply in cross-geo regions.

The second relates to the commoditization of supply [3] -
Uber’s suppliers are interchangeable (or commoditized), i.e. customers just want a ride and are not particularly sensitive to driver identity or vehicle brand. Riders too are indifferent to wait times below a certain threshold. Airbnb, on the other hand, has differentiated supply, i.e. each unit of supply is unique to some degree across a number of attributes: type of property, quality, nightly rate, location, capacity, etc. Thus, as Airbnb scaled and offered choices across each variety, it became more and more difficult for its competitors to match the variety, while in the case of Uber, it was much easier to build a similar network of vehicle types and drivers.

These two differences account for why the valuation-to-funding ratio is so different for these two companies.

Credits to Sameer Singh from Breadcrumb.vc for insights into this.

Is Zoom sustainable?

Given the usage that Zoom has seen in a Covid-affected world, the question is whether Zoom's business model is sustainable. This article[1] from Sameer Singh explains that Zoom's core business model is based on the virality of its product and not on a network effect. Other video conferencing tools (Skype, FaceTime) need you to be part of a network to enjoy their features; there is no such limitation on Zoom. A Zoom meeting host can share the meeting ID, and anyone who has the link can join. There is no attempt at forming a network in the first place, which could have built a high switching cost for its users. Sameer thus feels that Zoom's business is susceptible to being run over by a similar video conferencing app with a better customer experience or technology.

Thursday, March 26, 2020

Technology to aid in containing Covid spread

Came across interesting uses of technology around Covid spread, surveillance, violations -

1. Using location tracking via smartphones for Covid surveillance
This one talks about how Taiwan is ensuring that people who have been exposed to the virus stay in their homes. The system monitors phone signals to alert police and local officials if those in home quarantine move away from their address or turn off their phones. Officials also call twice a day to ensure people don’t avoid tracking by leaving their phones at home.

News from April 1 mentioned that the system was tracking more than 55,000 people. The system has been very accurate, with only about 1% of alerts being false alarms, mostly because of inaccurate location readings.

2. Identifying lockdown violations (post-facto analysis)
The second one here is a report that was published to demonstrate how public data and visual AI can be used to identify lockdown violations. Taking actual images and videos from public Instagram profiles of 552,000 Italians between March 11-20, 2020, and applying image recognition technology, they were able to estimate what percentage of people were not following quarantine, which city/region they belonged to at an aggregate level, and exactly where they were spending time (viz. parks, markets, malls). The report states that the entire data set was anonymized in the interest of privacy.



3. Using cough analysis to determine if one is Covid affected 
This link talks about using AI and deep learning to determine if a person has Covid by analyzing the sound of their cough, the way they breathe, or the way they speak. It's based on the premise that the cough of a Covid patient is distinct from that of a healthy person. I also stumbled upon this site https://www.coughagainstcovid.org/ which is collecting cough sound data through crowdsourcing to create such a technology. This initiative is supported by the Bill & Melinda Gates Foundation and is in collaboration with Stanford University.

4. Using real-time mobile location data to detect violation of social distancing
Unacast is a company that collects and provides cellphone location data. It has aggregated all this to come up with a Social Distancing Scorecard. This scorecard is based on analysis of information such as two devices being at the same place at the same time (thus violating social distancing), visits to non-essential places (other than grocery stores), and other parameters. Unacast collects data from various location-tracking apps installed on phones.
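The "two devices at the same place at the same time" check can be sketched in a few lines of Python. This is my own illustration, not Unacast's method: the ping format, the device names, and the thresholds (2 metres, 60 seconds) are all assumptions, and real phone location data is usually far less precise than this. The distance calculation uses the standard haversine formula.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def close_contacts(pings, max_distance_m=2.0, max_gap_s=60):
    """Return pairs of device ids seen near each other at about the same time."""
    contacts = set()
    for a, b in combinations(pings, 2):
        if a["device"] == b["device"]:
            continue  # two pings from the same phone
        if abs(a["time"] - b["time"]) > max_gap_s:
            continue  # not at the same time
        if haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_distance_m:
            contacts.add(tuple(sorted((a["device"], b["device"]))))
    return contacts


# Hypothetical pings: time is in seconds, coordinates are made up.
pings = [
    {"device": "A", "time": 0, "lat": 28.61390, "lon": 77.20900},
    {"device": "B", "time": 30, "lat": 28.61390, "lon": 77.20901},
    {"device": "C", "time": 10, "lat": 28.70000, "lon": 77.10000},
]
print(close_contacts(pings))  # {('A', 'B')}
```

Comparing every pair of pings is fine for a small example; at the scale of a whole country one would first group pings by area and time window.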

5. Mapping movement of coronavirus carriers
The South Korean government is publishing the movements of people before they were diagnosed with the virus — retracing their steps using tools such as GPS phone tracking, credit card records, surveillance video and old-fashioned personal interviews with patients. The idea is to let the public know, via a central website and regional text messages, if they may have crossed paths with carriers, whose names are not made public. Here is the link to site- https://coronamap.site/


Will keep adding more as I discover. 

Wednesday, February 5, 2020

Making it easier to discover datasets

https://datasetsearch.research.google.com/

Google has made it easier to discover thousands of data repositories on the web, providing access to millions of datasets. Great help to AI, ML, NLP devs!

Here is the release note - https://www.blog.google/products/search/making-it-easier-discover-datasets/

Saturday, February 1, 2020

How can WhatsApp control fake news?

India is WhatsApp's largest market by number of users, with 400 million monthly users according to July 2019 company figures. That equates to more than one-quarter (26.7%) of its total reported user base. How big a market India is can be gauged from the fact that the company's second-largest market is Brazil, with 120 million users. In terms of penetration among smartphone users too, India is among the top three markets, with more than 90% of smartphone users in India using WhatsApp.


Fake news has infested WhatsApp, with conspiracy theories, anti-vaccination misinformation, and panicked rumours about child abductors that have even led to fatal lynchings in some parts of India. WhatsApp's previous attempts to contain fake news included steps like limiting the number of times a message can be forwarded to five and adding a visual indicator to show that a message has been forwarded. However, limits on forwards slow the spread of fake news but don’t stop it, and if media reports are to be believed, there are software tools available for as little as Rs 1,000 that let you bypass WhatsApp’s forward restrictions.

So what can WhatsApp do? The underlying solution lies in a mix of AI and NLP. NLP, or Natural Language Processing, can help identify the text that the user has shared in a message, and AI, or Artificial Intelligence, can be used to match it against an offline database of fake news to infer whether a particular shared message is fake or not. For images, automatic text extraction will have to be used, and for videos, frame-by-frame analysis and speech-to-text. As I understand it, through the 'CheckPoint Tipline' that was initiated by WhatsApp in April 2019 and discontinued later in the year, the company has already crowdsourced (collected) enough data about fake news and probably has the offline database (technically a model in AI) ready.

With an AI model in place, WhatsApp can thus automatically check all messages before they reach you and notify the user that a received message is fake. Users can also be given an option to report a suspicious message as fake, which behind the scenes sends the message to WhatsApp for verification and lets them add it to their database after verification.
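To give a flavour of the matching step, here is a very simplified sketch in Python. It is not WhatsApp's system and not a real AI model: it just compares the words of an incoming message with a small, made-up list of known fake messages using Jaccard similarity (the share of words in common). The function names, the 0.6 threshold, and the sample messages are all my assumptions for illustration.

```python
import re


def tokens(text):
    """Lower-case a message and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words the two sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def looks_like_known_fake(message, known_fakes, threshold=0.6):
    """Return the best-matching known fake message, or None if nothing is close."""
    message_tokens = tokens(message)
    best_match, best_score = None, 0.0
    for fake in known_fakes:
        score = jaccard(message_tokens, tokens(fake))
        if score > best_score:
            best_match, best_score = fake, score
    return best_match if best_score >= threshold else None


# Hypothetical database of messages already verified as fake.
known_fakes = ["Drinking hot water every 15 minutes cures the virus"]

forwarded = "URGENT: drinking hot water every 15 minutes cures the virus!!"
print(looks_like_known_fake(forwarded, known_fakes) is not None)                 # True
print(looks_like_known_fake("Meeting moved to 5 pm", known_fakes) is not None)   # False
```

A word-overlap check like this only handles ASCII English text and is easily fooled by rephrasing, which is why a trained model covering many languages would be needed in practice.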


In my suggestion, I emphasize notifying a user instead of auto-removal for two reasons -

  • Results from a study show that participants who were exposed to a correction of any kind were significantly less likely to believe the false information posted by the first user, relative to those who do not receive a correction. 
  • Users should be given the power to know and understand which messages were fake and, at the same time, an opportunity to refute WhatsApp's claim.


There are quite a few challenges in this approach, the biggest of which is that messages between the sender and receiver are end-to-end encrypted and even WhatsApp cannot see their content. This calls for a change in encryption policy, and while it would allow fake messages to be filtered, it would also open up WhatsApp to government agencies who might want to snoop on your messages. Also, any learned AI model would have to be constantly adapted to new strategies and techniques of disinformation. Further, the model would have to be sufficiently trained over a wide range of languages, as messages on WhatsApp are shared in many languages.

Friday, January 24, 2020

Gandhi- the Mahatma - Book review of My Experiments with Truth

"The Story of My Experiments with Truth" by MK Gandhi is his autobiography or as he puts it, his experiments with truth, starting from his childhood days to his days in England to the starting of Satyagraha in South Africa to the beginning of Independence movement in India around 1920.

Before reading this book, I thought he was called Mahatma because he steered India's freedom struggle but I now realize, it was his endeavor towards the welfare of the people that made him earn this respect from people. Through multiple stories from his life, I understand the aim of his life was to try to discover and embark on the path to Truth and attain Moksha. His ashram first set up in South Africa and then later in Gujarat was aimed to lead by example where everybody voluntarily shared all work irrespective of caste and creed. He would take up cases for people for free, raise social and political causes with authorities, challenge the status quo and try to reform people around him. Few would know that during the Boer war in South Africa, Gandhi went to the battlefield as a volunteer for ambulance corps ferrying the wounded on a stretcher.

He was a big proponent of Brahmacharya, but his idea of the same included not only monogamy but also simplicity of clothes and of the food one eats. Despite being a Barrister, Gandhi gave up his European dress, started to live in a simple home, traveled 3rd class by train/ship, and chose to donate whatever extra money he had after covering his basic expenses. He has written extensively about his experiments with what we today call a 'vegan diet'.

We all pretty much know about Gandhi’s take on standing for truth. His idea was to hate the sin but not the sinner, coming from the notion that we are all created by one single Creator and hating a sinner is the same as hating that ultimate Creator. At another place, he says it is the reformer who is anxious for the reform, and not the society, from which he should expect nothing better than opposition, abhorrence and mortal persecution. His idea of non-violence through Satyagraha is, I guess, the most difficult thing to practice. It takes a lot of courage to hold back one’s natural instinct, which is to take revenge, and instead face the opponent gracefully.

None of the things that Gandhi tried came to him overnight. He constantly experimented with ideas and adapted them before embracing them in his life.

Gandhi has painted a very sober portrait of himself in the book, which I guess is the mark of a truly learned person. He has highlighted how he was an average student in class, how he failed at his first practice in Bombay, and how he had great difficulty in speaking impromptu, including at events where he was an organizer.

Overall, it’s a very inspiring read and there cannot be a better quote than what Einstein said for Gandhi “Generations to come will scarce believe that such a one as this ever in flesh and blood walked upon this Earth”

Saturday, January 4, 2020

Trip to Vizag - Beaches, Caves and Eastern Ghats

I had a chance to see Visakhapatnam (also known as Vizag) from 29 Dec 2019 to 2 Jan 2020.



Our trip started with an early (really early :-) at 5:20 AM) morning flight from Delhi to Vizag. With the alarm set for 2:30 AM, the anxiety of missing it didn't allow me to sleep much in the night. Nevertheless, the flight left before time and reached before time as well. There is a good view of the Bay of Bengal as the plane approaches the airport.






I had pre-booked a Zoomcar for 2 days and, from the airport, headed to the Zoomcar location to collect the car. Once done with the usual inspection of the car, we started our trip to Araku. The distance from Vizag to Araku is about 110 km, of which half is plain road and the remainder winds around hills. There aren't many roadside dhabas or restaurants on the way. The road on the hills is 2 lane, and we occasionally got stuck behind buses waiting for traffic to clear. Maybe there was more rush because it was a Sunday, but it took us approx 5 hours to cover a journey which should normally have taken 3 hours. We arrived at Haritha Valley Resort and, after having the buffet lunch in their restaurant, we were all tired and took a quick nap.

In the evening, we went to see the Tribal Museum and Coffee Museum, both within walking distance from our resort. The Tribal Museum is pretty good and shows tribal life. The Coffee Museum is average, and it's more of a shopping arcade than a museum. I skipped Dumriguda Waterfalls (a simple waterfall) and Katiki Falls (a 1-hour walk to reach, and not possible with a small kid) in Araku.






The next day, we started from Araku Valley for Borra Caves and reached in about 90 minutes. These limestone caves are probably the largest in India. We took a guide here who showed us various shapes of gods, animals, and humans formed naturally from limestone. The Gosthani River, which originates from these caves and flows through them, is the cause of the odd shapes of the structures. The name "Borra" means hole, and the popular legend is that a cow, grazing on top of the caves, dropped through a hole in the roof and was found across the river, saved because of Lord Shiva.



At the exit of Borra Caves, I tried the bamboo chicken. It's regular marinated chicken cooked in bamboo. Contrary to what I had read, I didn't find it any better than regular barbecued chicken, but it was good and reasonably priced.
From Borra to Vizag, we decided to take a detour and see Rushikonda beach before returning the car. The beach was very beautiful and by and large clean. After returning the car, we checked in to Dolphin Hotel.

A note about the hotel, very hospitable staff - everyone makes you feel special. Awesome breakfast with multiple options to choose from, well maintained swimming pool and a trainer to assist you in the gym.



Heading back to the trip, next day we went out to see Kailasagiri. We took a cable car to reach the top of the hill. Saw the wonderful Shiva-Parvati statue on the top of the hill. The hill also provides an awesome view of the Vizag beaches -  



Next, we saw the INS Kursura Museum. This is one of the must-see things in Vizag and contains the actual submarine used by the Indian Navy from 1969 to 2001, now preserved as a museum for public access. Having seen the movie "Ghazi Attack", we could relate to every part of the submarine and thoroughly enjoyed it.







The next day we saw the TU-142 Aircraft Museum which, like the submarine museum, is a preserved Tupolev TU-142 aircraft that served the Indian Navy for 29 years. This museum was inaugurated in 2017 by President Kovind and, besides the actual walk-through of the aircraft, provides good information about various parts of the plane through a free audio guide.

After that, we saw RK beach. Being near the city, it's more crowded. However, if one strays slightly away from the main beach, one can avoid the crowd and enjoy the wonderful sea. The long coastline is good for long walks. And I could imagine myself reading a book sitting on the beach, away from the general hustle and bustle of the city.









On the last day, since it was Gurupurab, we went to the local gurudwara and, with God's grace, had langar. After that, we did some shopping (actually just from one store, Lepakshi) and then headed to the airport for our flight back to Delhi.

As for the people in Visakhapatnam, I found them very friendly. There is some language issue, but most people can understand and speak broken Hindi (more so than English). Some streets were crowded, but there was no road rage to be seen, nor people in an unnecessary hurry. Wherever we went and at whatever time of the day, we were free of the leering stares generally seen in North India.

Food is cheap, and people from all walks of life enjoy street food. In a decent restaurant, one will be charged the MRP of a cold drink and not some exorbitant price just for putting that cold drink in a glass.

Finally, the beautiful drawings in front of houses and shops are a treat for the eyes.