Knowledge Graphs in E-Commerce

By Janani Balaji, Data Science Manager, The Home Depot

Getting an E-Commerce system to function reliably is not a trivial task. It is a sophisticated ecosystem whose operation depends on understanding the complex dynamics among customers, products, fulfillment methods, inventory planning, pricing, and more.

For instance, take product searches or recommendations in the online space. Knowing which products to recommend to the customer requires us to have a thorough perspective of both sides of the equation.

We need a good understanding of the customer’s needs:

  • Why is the customer here?
  • Do they prefer a certain brand?
  • Which attributes (color, pattern) are they looking for?

as well as an authoritative view of the existing inventory:

  • What kind of products do we have?
  • What are their attributes?
  • Are they in stock?
  • How fast can we get them to the customer?

Finally, we need to match the two sides to generate an experience that will resonate with the customer.

The relationship between the entities involved is multi-dimensional; hence, standard tabular data formats do not scale well to represent these complex interactions.

What is a knowledge graph?

A graph data structure, on the other hand, offers a more intuitive way to represent and visualize such multi-dimensional relationships. Unlike in other data representation formats, relationships are first-class citizens in a graph structure, making them ideal candidates for representing complex data interactions. In a graph, the entities are represented as nodes or vertices, and the relationship between them is represented as edges. The nodes and edges can have properties that provide additional information about the nodes themselves or the connection between two nodes.
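To make the vocabulary concrete, here is a minimal sketch of a property graph in plain Python. The class names, node ids, and property values (`PropertyGraph`, `user_1`, `ExampleBrand`, and so on) are invented for illustration and are not taken from any production system; a real deployment would more likely sit on a graph database or graph library.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str  # entity type, e.g. "User" or "Product"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    relation: str  # relationship type, e.g. "clicks"
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory graph: nodes and edges both carry properties."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, target, relation, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, relation, properties))

    def neighbors(self, node_id, relation=None):
        """Targets reachable from node_id, optionally filtered by relation."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (relation is None or e.relation == relation)
        ]


# Hypothetical sample data, for illustration only.
graph = PropertyGraph()
graph.add_node("user_1", "User")
graph.add_node("product_1", "Product", brand="ExampleBrand", in_stock=True)
graph.add_node("color_red", "Attribute", name="color", value="red")
graph.add_edge("user_1", "product_1", "clicks", count=3)
graph.add_edge("product_1", "color_red", "has_attribute")
```

Because the relationship is stored as its own object, a question such as "what did this user click?" becomes `graph.neighbors("user_1", "clicks")` rather than a join across tables.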

At The Home Depot, we leverage knowledge graphs to represent such complex dynamics in our E-Commerce network. The figure below presents a sample graph representing the relationship between customers, products, and their attributes.

 

 

We can observe that the interactions between users, products, and their attributes are easier to visualize when represented as a graph connecting these three kinds of entities. It becomes intuitive to recommend Product 3 to User 1 and User 2 even though neither user has interacted with Product 3, since the relationship between Product 1, Product 2, and Product 3 becomes more easily discoverable through this graph.

Graphs can be either homogeneous or heterogeneous. In homogeneous graphs, all the entities belong to the same type, with the edges representing the relationships between the entities. An example of a homogeneous graph would be a simple product graph in which all the nodes are products, and the edges represent co-purchase behavior. A heterogeneous graph, on the other hand, has nodes and edges belonging to several different types. The user-product interaction graph shown above is an example of a heterogeneous graph.
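The distinction can be stated in a few lines of Python. This is only a sketch: the node names, the type labels, and the relation names (`co_purchased_with`, `clicks`, `needs`) are assumptions made up for the example.

```python
def graph_kind(node_types, edges):
    """Classify a graph by how many node and edge types it contains.

    node_types: dict mapping node id -> type label
    edges: list of (source, relation, target) triples
    """
    node_kinds = set(node_types.values())
    edge_kinds = {relation for _, relation, _ in edges}
    if len(node_kinds) <= 1 and len(edge_kinds) <= 1:
        return "homogeneous"
    return "heterogeneous"


# Homogeneous: every node is a product, every edge is a co-purchase.
co_purchase_types = {"Product 1": "Product", "Product 2": "Product"}
co_purchase_edges = [("Product 1", "co_purchased_with", "Product 2")]

# Heterogeneous: users and products, linked by different relations.
interaction_types = {"User 1": "User", "Product 1": "Product", "Product 3": "Product"}
interaction_edges = [
    ("User 1", "clicks", "Product 1"),
    ("Product 1", "needs", "Product 3"),
]
```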

Applications of Knowledge Graphs in E-Commerce

Graphs are ideal data structures for discovering unknown relationships between entities. As seen in the previous section, finding related entities in a graph-based structure becomes a simple task of traversing its edges. By defining a path, that is, a sequence of edge types to follow, knowledge graphs can be used to mine previously unknown entity relationships. For example, going back to the graph above, recommending related products to User 1 and User 2 can be solved by traversing the graph to find paths of type clicks -> needs. The large-scale traversal capabilities of graphs make them well suited to powering real-time recommendations.
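A path-based traversal of this kind can be sketched as follows. The exact edges of the sample graph are an assumption here: we suppose that User 1 clicked Product 1, User 2 clicked Product 2, and both products have a needs edge to Product 3. The helper names `build_index` and `follow_path` are likewise hypothetical.

```python
from collections import defaultdict

# Assumed edges, written as (source, relation, target) triples.
triples = [
    ("User 1", "clicks", "Product 1"),
    ("User 2", "clicks", "Product 2"),
    ("Product 1", "needs", "Product 3"),
    ("Product 2", "needs", "Product 3"),
]


def build_index(triples):
    """Index targets by (source, relation) so each hop is a dictionary lookup."""
    index = defaultdict(list)
    for source, relation, target in triples:
        index[(source, relation)].append(target)
    return index


def follow_path(index, start, relations):
    """Return every node reachable from start by following relations in order."""
    frontier = {start}
    for relation in relations:
        frontier = {
            target
            for node in frontier
            for target in index.get((node, relation), [])
        }
    return frontier


index = build_index(triples)
recommendations = follow_path(index, "User 1", ["clicks", "needs"])
```

Under these assumed edges, following clicks -> needs from either user lands on Product 3, a product neither of them has interacted with directly.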

Knowledge graphs also enable us to use the wealth of graph algorithms to derive more sophisticated insights. By representing the data in the form of graphs, these algorithms leverage not just the information represented by the entities, but also the network generated by linking these entities through edges. For instance, graph algorithms like Community Detection help identify groups of customers that share similar behavioral characteristics. These clusters can then be used to personalize the customer experience.
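As a rough illustration of grouping customers by behavior, the sketch below links two customers when the Jaccard similarity of their purchased products reaches a threshold, then takes the connected components of that similarity graph. This is a deliberately simple stand-in, not a full community detection algorithm such as Louvain or label propagation, and the customer names, products, and the 0.5 threshold are all made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def customer_groups(purchases, threshold=0.5):
    """Group customers whose purchase sets are similar, via union-find."""
    customers = sorted(purchases)
    parent = {c: c for c in customers}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path compression
            c = parent[c]
        return c

    for a, b in combinations(customers, 2):
        if jaccard(purchases[a], purchases[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for c in customers:
        groups.setdefault(find(c), []).append(c)
    return sorted(groups.values())


# Hypothetical purchase histories.
purchases = {
    "ann": {"drill", "drill bits", "screws"},
    "bob": {"drill", "drill bits"},
    "cat": {"paint", "roller"},
    "dan": {"paint", "roller", "tape"},
}
groups = customer_groups(purchases)
```

Here the tool buyers and the paint buyers fall into separate groups, which could then be targeted with different experiences.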

Similarly, analyzing the centrality of nodes in the graph can help identify key customers and products that drive revenue. Link Prediction, another graph technique that estimates whether a pair of entities is likely to be related, can be used to determine if a product would be considered by a customer. Techniques like Random Walk and Belief Propagation are frequently used to drive product recommendations at scale.
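Two of these ideas fit in a short sketch. Degree centrality is the simplest centrality measure (many others exist), and the link score below counts three-hop paths between a customer and a product, which is one common heuristic for bipartite interaction graphs rather than a trained link prediction model. The interaction data and function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical (customer, product) interactions.
interactions = [
    ("u1", "p1"), ("u2", "p1"), ("u3", "p1"),
    ("u2", "p2"), ("u3", "p2"),
    ("u3", "p3"),
]


def build_adjacency(interactions):
    """Undirected adjacency sets over customers and products."""
    adj = defaultdict(set)
    for user, product in interactions:
        adj[user].add(product)
        adj[product].add(user)
    return dict(adj)


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def link_score(adj, user, product):
    """Count paths user -> product' -> user' -> product (0 if already linked)."""
    if product in adj[user]:
        return 0
    score = 0
    for other_product in adj[user]:
        for other_user in adj[other_product]:
            if other_user != user and product in adj[other_user]:
                score += 1
    return score


adj = build_adjacency(interactions)
centrality = degree_centrality(adj)
```

In this toy data, `link_score(adj, "u1", "p2")` is higher than `link_score(adj, "u1", "p3")`, because more of the customers who share a product with u1 also interacted with p2.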

The advent of graph neural networks and graph embeddings unleashed the power of knowledge graphs in E-Commerce even beyond traversals and graph mining. Graph Neural Networks (GNNs) are neural networks built on graph data that can be used to learn representations of a graph's elements (nodes, edges) while preserving the underlying structure.
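The core idea behind many GNNs is neighborhood aggregation: each node updates its vector using the vectors of its neighbors. The sketch below performs one round of plain mean aggregation and intentionally leaves out the learned weight matrices and nonlinearity that a real GNN layer would apply. The feature values and node names are placeholders.

```python
def aggregate_once(features, adjacency):
    """One round of mean aggregation over each node and its neighbors.

    features: dict node -> list of floats (all the same length)
    adjacency: dict node -> list of neighbor nodes
    """
    updated = {}
    for node, vector in features.items():
        neighborhood = [vector] + [features[n] for n in adjacency.get(node, [])]
        updated[node] = [
            sum(values) / len(neighborhood) for values in zip(*neighborhood)
        ]
    return updated


# Placeholder features for a tiny user - product - attribute chain.
features = {
    "user": [1.0, 0.0],
    "product": [0.0, 1.0],
    "attribute": [0.0, 0.0],
}
adjacency = {
    "user": ["product"],
    "product": ["user", "attribute"],
    "attribute": ["product"],
}
updated = aggregate_once(features, adjacency)
```

After one round, each node's vector already mixes in information from its direct neighbors; repeating the step spreads information further along the graph's structure.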

In an E-Commerce setting, training a GNN lets us obtain high-dimensional vector embeddings that capture semantic information about the interactions between customers, products, and their attributes. These encodings are key to understanding customer needs and existing product capabilities, and to intelligently connecting the two. At The Home Depot, these learned representations are then used for a variety of tasks, such as finding similar and compatible products, identifying product taxonomy, retrieving relevant products for a given search term, and retrieving similar search terms.
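Once embeddings exist, a task like finding similar products reduces to a nearest-neighbor search in vector space. The sketch below uses cosine similarity over hand-written three-dimensional vectors; the product names and values are placeholders, not real learned embeddings, which would have many more dimensions and would typically be searched with an approximate nearest-neighbor index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(embeddings, query, k=2):
    """Return the k items whose embeddings are closest to the query item's."""
    scores = [
        (cosine_similarity(embeddings[query], vector), name)
        for name, vector in embeddings.items()
        if name != query
    ]
    return [name for _, name in sorted(scores, reverse=True)[:k]]


# Placeholder embeddings, for illustration only.
embeddings = {
    "cordless_drill": [0.9, 0.1, 0.0],
    "impact_driver": [0.8, 0.2, 0.1],
    "paint_roller": [0.0, 0.1, 0.9],
    "paint_tray": [0.1, 0.0, 0.8],
}
similar_to_drill = most_similar(embeddings, "cordless_drill", k=1)
```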

Challenges

While knowledge graphs offer a host of benefits, there are a few challenges in adopting them at a large scale. One of the key requirements in a knowledge graph is disambiguating the entities involved, which means assigning a single normalized form that uniquely identifies each entity. The same product might have two different surface forms due to cultural norms or human practices. If these are not resolved to the same entity, incorrect relationships can be formed, rendering downstream inference tasks inaccurate. Similarly, multiple membership assignments could add layers of complexity to the graph.
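A very simplified picture of entity resolution is sketched below: surface forms are lowercased, stripped of punctuation, and mapped through an alias table so that variants collapse to one canonical key. The alias table and product strings are invented for the example, and real entity resolution usually needs far more than string normalization (for instance fuzzy matching or learned models).

```python
import re

# Hypothetical alias table mapping regional terms to one canonical term.
ALIASES = {"tap": "faucet", "sofa": "couch"}


def canonical_form(surface, aliases=ALIASES):
    """Normalize case and punctuation, then replace known aliases token by token."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", surface.lower()).split()
    return " ".join(aliases.get(token, token) for token in tokens)


def resolve_entities(surface_forms, aliases=ALIASES):
    """Group surface forms that normalize to the same canonical entity key."""
    entities = {}
    for surface in surface_forms:
        entities.setdefault(canonical_form(surface, aliases), []).append(surface)
    return entities


resolved = resolve_entities(["Kitchen Tap", "kitchen-faucet", "Kitchen Faucet", "Sofa"])
```

With these assumed aliases, the three kitchen variants collapse into a single node key, so edges attached to any of them end up on the same entity.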

Another problem in dealing with knowledge graphs lies in operationalizing and maintaining them at scale. Especially in E-Commerce, if we were to model all the interactions between the customers and products along with their attributes, the graph could easily explode in size, resulting in operational problems related to performance and maintainability.

Summary

Knowledge graphs are important tools for demystifying complex multi-dimensional interactions. They let us visualize and leverage multi-dimensional relationships organically. We saw what knowledge graphs are, along with examples of how they can be leveraged in E-Commerce. With the rapid advancements in graph storage and graph processing capabilities, graphs are taking center stage in representing complex relationships and their interactions.

However, to fully realize these benefits, organizations will have to develop a sustainable infrastructure that can support graph data storage and rendering at scale. Given the steady progress in computing technology and strong community interest in leveraging graph knowledge, this is an exciting time to dive deep into knowledge graphs.
