116. Infra, platforms, and knowledge graphs


“Software is Feeding the World” is a weekly newsletter for Food/AgTech leaders about technology trends.

“LA Confidential” (1997) is one of my favorite movies. It is set in 1950s Los Angeles and tells the “story of a group of LAPD officers in 1953, and the intersection of police corruption and Hollywood celebrity.”

At the very beginning of the movie, the mayor of Los Angeles boasts about plans to build freeways in LA.

The Arroyo Seco freeway is just the beginning. We're planning freeways from Downtown to Santa Monica, from the South Bay to the San Fernando Valley. Twenty minutes to work or play is the longest you'll have to travel.

In the mid-1950s, US President Dwight D. Eisenhower kicked off the national highway system with the Federal-Aid Highway Act of 1956, which authorized a 41,000-mile network of highways.

The infrastructure of the US highway system has had a profound impact on the US economy, history, and culture. According to a study commissioned in 1996 (40th anniversary of the act of 1956), some of the benefits of the highway system infrastructure were as follows:

  • From 1950 to 1989, approximately one-quarter of the nation's productivity increase is attributable to increased investment in the highway system.
  • The highway system improved road safety, expanded mobility, reduced transportation and logistics costs, and led to the suburbanization of the US.

Some of the stats above indicate the outsized impact of the road network infrastructure investments. Building infrastructure can be less cool, but its impact is much larger.

Bill Gates once said, “a platform is when the economic value of everybody that uses it, exceeds the value of the company that creates it.”

Going back to the analogy of the US freeway system, if you pave your unpaved driveway, you are not building infrastructure. It is a specific application, though you might call it a platform.

There is significant talk of platforms within agriculture. You go to an agtech show and every company talks about how they are a platform. Most of them are not platforms, but a collection of applications which solve a specific problem.

In edition 83, “Picks and Shovels,” I wrote about the partnership between Bayer and Microsoft, and speculated whether the combination could create a platform of capabilities to power applications within agtech. I proposed a technology stack for agriculture as shown below.

As you move higher on the technology stack, you move from tools to specific applications like FieldView or other third-party applications.

  • The lower you are on the stack, the more broadly applicable the tools are, and the more they tend to be a commodity. For example, compute and storage have become a commodity in the last 20 years, and the main compute and storage providers can sell them to any size company (and to individuals) in any industry in the world.
  • As you go higher, the tools as well as the applications become more specific to the agriculture industry, and so are applicable to a smaller set of customers, but you can create and capture more value per customer.

As I have indicated before, there are not many true technology platforms within agriculture that accelerate innovation within the industry.

Cropin Cloud

It is fun to see the launch of Cropin Cloud from Cropin. Cropin Cloud is a cloud platform with integrated applications.

Today, the company announced the launch of Cropin Cloud, a cloud platform with integrated apps. Founded in 2010, Cropin’s other products are live in 92 countries, it is partnered with over 250 B2B customers and it has digitized 26 million acres of farmland. It claims the world’s largest crop knowledge graph of more than 500 crops and 10,000 crop varieties.

Cropin reports annual revenue of about $20 million (between $15 million and $25 million) from 250 B2B customers, which works out to an average of $80K per customer. If Cropin’s revenue is purely software based, $80K per customer

Based on this report, Cropin Cloud has a reasonably sized footprint (26 million acres), though it is dwarfed by the hundreds of millions of acres mentioned by Deere, The Climate Corporation, Syngenta, etc. Given that they operate in 92 countries, one has to assume their experience in working with a variety of crops gives them access to some specialized knowledge and interesting data sets.

Cropin Cloud claims that it can be used by agribusinesses of all sizes. The organizational and technical skill sets required to serve a large organization are very different from those required to serve a large number of small organizations. So I am taking this claim from Cropin with a healthy dose of skepticism.

Cropin has three sub-platforms that allow farmers and other stakeholders in the food value chain to access tools for earth observation, remote sensing and data and machine learning to help them better manage crops and harvests.

What Cropin claims as three sub-platforms (see below) are actually different parts of a solution stack which includes digitization applications. This could be anything from a data collection process to applications to solve specific problems. Cropin Apps feels similar to Climate’s FieldView, but for the smallholder market and a much larger variety of crops.

Cropin’s data hub includes unified data from many different sources like remote sensing, precision equipment, IoT etc. The data hub includes ML-ready data pipelines for enhanced analytics. It interfaces with any and all data surfaces. Cropin’s data hub feels similar to the John Deere Ops Center, with connections to other applications.

Cropin Intelligence will have access to a library of well-tested contextual models. Cropin lists about 22 different models like crop detection, yield estimation, irrigation scheduling, pest and disease prediction, nitrogen uptake, water stress detection, harvest date estimation, change detection, and plot score, among others. Some of the models, like pest and disease prediction, yield estimation, nitrogen uptake, and harvest date estimation, are fairly sophisticated, and it is difficult to make them work in a variety of situations and conditions.

The platform aspect Cropin highlights is its ability to “help cut down data engineering efforts by upto 80% for organizations & enterprises that need to build similar data integrations for intelligent insights on their own.”

Cropin’s value prop could potentially include the ability to collect, clean, and contextualize data, which is a very hard problem to solve. It enables combining enterprise, agronomy, & remote field data with Cropin’s Crop Intelligence models and datasets to surface significant insights and value for the business while sparing the need to expend critical resources on re-inventing the wheel.

Built using the world's most extensive crop knowledge graph, these models have been field-tested and deployed by over 250 public and private sector enterprises worldwide.

Knowledge graphs

Cropin mentions the use of a crop knowledge graph with more than 500 crops and 10,000 crop varieties to build agronomically valuable models.

So what is a knowledge graph?

Knowledge graphs represent a collection of interlinked facts about a domain (a subject area, for example agriculture). Entities and relations are extracted from the unstructured data and stored in the form of a triple: subject-predicate-object.
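To make the triple idea concrete, here is a minimal sketch in Python. The facts, entity names, and relation names (such as `grown_in` and `susceptible_to`) are made up for illustration; they are assumptions, not taken from Cropin or any real knowledge graph.

```python
from collections import defaultdict

# Hypothetical facts, each stored as a (subject, predicate, object) triple.
triples = [
    ("wheat", "is_a", "crop"),
    ("wheat", "susceptible_to", "rust"),
    ("rust", "is_a", "fungal disease"),
    ("wheat", "grown_in", "Punjab"),
    ("rice", "is_a", "crop"),
    ("rice", "grown_in", "Punjab"),
]

# Index the triples by subject so facts about an entity are easy to look up.
facts_by_subject = defaultdict(list)
for subject, predicate, obj in triples:
    facts_by_subject[subject].append((predicate, obj))


def objects_of(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [obj for pred, obj in facts_by_subject[subject] if pred == predicate]


def subjects_with(predicate, obj):
    """Return every subject that has `predicate` pointing at `obj`."""
    return [s for s, p, o in triples if p == predicate and o == obj]


print(objects_of("wheat", "susceptible_to"))  # ['rust']
print(subjects_with("grown_in", "Punjab"))    # ['wheat', 'rice']
```

A real knowledge graph would hold millions of such triples in a graph database rather than a Python list, but the shape of each fact is the same.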

A knowledge graph combines graphs with machine learning and artificial intelligence.

When I worked at Amazon Kindle, we ran an experiment to feed all the Harry Potter books to an algorithm. The algorithm identified characters in the books (for example, Ron, Hermione, Harry, etc.) by just “reading the book.” As a next step to surface interesting insights, we counted the number of words between mentions of two characters’ names, built a histogram of those word counts, and then plotted the result in a bubble chart.

For example, take three sentences like these (not real sentences from the Harry Potter canon):

Ron was happy to see Hermione. Ron had been busy at the ministry of magic during summer and so had not had a chance to talk to Hermione. Ron missed Hermione.

In the first sentence, there are 4 words between their names, in the second sentence 20 words, and in the third sentence 1 word. You run this exercise for the entire canon and for all the characters. Based on the height of the histogram (how often those two characters appear together) and its density, you can start to see which characters are together the most in the books, and how close they are to each other. Without reading the books, you can know that Ron, Hermione, and Harry are important characters, and that they have a strong bond with each other compared to, say, Luna Lovegood and Mrs. Weasley.
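The counting step can be sketched roughly as follows. This is an illustrative toy, not the code from the Kindle experiment: the function name `words_between` and the naive period-based sentence split are assumptions made for the example, and the character names are passed in by hand rather than detected by an algorithm.

```python
import re
from collections import Counter


def words_between(sentence, first, second):
    """Count the words between the first mention of `first` and the next mention of `second`.

    Returns None if the two names do not appear in that order.
    """
    words = re.findall(r"[A-Za-z']+", sentence)
    if first not in words:
        return None
    start = words.index(first)
    try:
        end = words.index(second, start + 1)
    except ValueError:
        return None
    return end - start - 1


text = (
    "Ron was happy to see Hermione. "
    "Ron had been busy at the ministry of magic during summer and so "
    "had not had a chance to talk to Hermione. "
    "Ron missed Hermione."
)

# Naive sentence split on periods; good enough for this toy passage.
sentences = [s.strip() for s in text.split(".") if s.strip()]

gaps = [words_between(s, "Ron", "Hermione") for s in sentences]
print(gaps)  # [4, 20, 1]

# Histogram: how often each gap size occurs for this pair of characters.
histogram = Counter(g for g in gaps if g is not None)
print(histogram)
```

Running the same count over every pair of characters and every sentence in the books would give one histogram per pair, which is the raw material for the bubble chart described above.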

This would be an example of a simple knowledge graph created from raw unstructured data coming from the books. As far as I know, the knowledge graph is a relatively new concept; the term was popularized by Google in 2012.

For example, Google has a knowledge graph for sustainability. It combines structured and unstructured data from a variety of sources (see chart below) and tries to “learn” the context within the data, and find new knowledge which is not readily visible or comprehensible.

Image source: Google Blog

In 2020-21, there was a huge war in the note-taking community between Roam Research, Obsidian, and Notion. Personal Knowledge Management (PKM) think bois were pushing for connected systems for notes, drawing upon the experience of the “Zettelkasten” system (a slip-box system of interconnected notes). PKM fanatics were showcasing how a connected note-taking system helps you find connections which never existed.

Example graph of connected notes (Source)

The graph of connected notes in your PKM system is not a true knowledge graph. It captures how different notes are connected structurally, but not semantically, so it cannot provide context or explainability. That task is left to the note taker.

A knowledge graph provides additional value by providing context, efficiency, and explainability (instead of acting as a black-box algorithm). Knowledge graphs have been used successfully in fraud detection, drug discovery, and other processes which combine structured and unstructured data.

An early example of a knowledge graph in literature (though not described as one) is when Sherlock Holmes talks about his brother Mycroft in one of my favorite Holmes short stories, “The Adventure of the Bruce-Partington Plans”:

We will suppose that a minister needs information as to a point which involves the Navy, India, Canada and the bimetallic question; he could get his separate advices from various departments upon each, but only Mycroft can focus them all, and say offhand how each factor would affect the other.

Mycroft gets structured and unstructured data from different sources, is able to connect them together, understand the context, and then come up with key insights!!

Knowledge graphs in agriculture

How can knowledge graphs work in agriculture?

Knowledge graphs can incorporate both structured data (for example, coming from a spreadsheet or precision agriculture equipment) and unstructured data (a Twitter feed, images, YouTube videos, bulletin board information, books, etc.). Knowledge graphs can be successful and valuable if they can uncover new insights by automatically incorporating new data sources, understanding the context, finding new connections, and continuously evolving and learning.

Building a data set of crops and varieties is a necessary and early step toward building a valuable knowledge graph in agriculture. It is an extremely hard challenge to go from data, to context, to connections, to new and surprising insights using knowledge graphs. It will take some unknown (aka long) amount of time.

In the News

Agronomy

What is Nano Urea and does it really work?

Pivot Bio has launched an entirely new class of products that integrates nitrogen seamlessly with the seed during planting

Robotics and Automation

Is Silicon Valley coming for farm workers’ jobs?

The robots are here?! But only for weeds

Augmenta Mantis, a computer vision-based agriculture tool provides an average of 10% of savings on inputs

Trimble collaborates with CLAAS and acquires French start-up Bilberry to push the envelope on precision agriculture and selective spraying systems

AgTech

Top 6 Questions (and Answers) Every Agribusiness Should Be Asking About Cyberattacks and Data Security

How to create an integrated and cohesive marketing strategy? Segmentation, channel management, pricing.

Is Agtech the future of the fertilizer industry?

GHX (Golden Harvest brand from Syngenta) with ServiceSquad includes customer service, in-person, on the phone and via collaborative tools on GHX Mobile. From product selection and purchasing to in-season progress and harvest analysis, GHX Mobile provides tools and insights that allow for seamless seed management and decision-making throughout the season.

Predictive maintenance for the win!

Ag giants announce collaboration in the fight against herbicide resistant weeds

Africa offers new opportunities for crop input companies

New credit facility opens up lending £250m to small and medium agricultural businesses until 2023

Pattern Ag total funding goes to $60 million, to build the largest soil metagenomics dataset in agriculture

Sustainability

“The societal and environmental costs of soil loss and degradation in the United States alone are estimated to be as high as $85 billion every year.”

CarbonNOW program provides guaranteed payments!!

Shipping accounts for about 3% of global carbon dioxide emissions. Ammonia could help clean up world shipping

Drought ravages cotton, and it could get worse!

About me

My name is Rhishi Pethe. I lead the product management team at Project Mineral (focused on sustainable agriculture). The views expressed in this newsletter are my personal opinions.
