The Power of Graph Algorithms – The New Stack

Julian Shun

Julian Shun, lead instructor for MIT Professional Education’s Graph Algorithms and Machine Learning course, is an associate professor of electrical engineering and computer science at MIT and a senior researcher in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). His research focuses on the theory and practice of parallel algorithms and programming, with particular emphasis on the design of algorithms and frameworks for large-scale graph processing and spatial data analysis. He also studies parallel algorithms for text analysis, concurrent data structures and deterministic parallelism methods. Prior to joining MIT, he was a Miller Postdoctoral Fellow at UC Berkeley. His honors include the NSF CAREER Award, DOE Early Career Award, ACM Doctoral Dissertation Award, CMU School of Computer Science Doctoral Dissertation Award, Google Faculty Research Award, Google Research Scholar Award, SoE Ruth and Joel Spira Award for Excellence in Teaching, Facebook Graduate Scholarship and best paper awards at PLDI, SPAA, CGO and DCC.

A graph is a data structure composed of vertices and edges. A vertex is simply any entity represented by the data (for example: a social media user), and an edge is a relationship between these different entities.
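
As a minimal sketch of this definition, the Python snippet below stores a small undirected graph as an adjacency list, one common in-memory representation. The user names and the `add_edge` helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Adjacency list: each vertex maps to the set of vertices it shares an edge with.
graph = defaultdict(set)


def add_edge(g, u, v):
    """Add an undirected edge (a mutual relationship) between u and v."""
    g[u].add(v)
    g[v].add(u)


# Hypothetical social media users (vertices) and friendships (edges).
add_edge(graph, "alice", "bob")
add_edge(graph, "bob", "carol")

print(sorted(graph["bob"]))  # ['alice', 'carol']
```

Sets are used here so that adding the same relationship twice does not create duplicate edges; a directed graph would simply drop the second `add` call.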

Graph algorithms have practical applications in a wide range of fields. They can help social media companies better understand user interactions and relationships, help streamline transportation and logistics, and help financial institutions identify cases of fraud faster and more accurately. However, graph algorithms are often misunderstood, even among many data scientists – an issue that will become more pressing as the data collected and managed by organizations continues to grow.

Here are three types of stakeholders who can benefit from knowledge and experience with graph algorithms.

Data scientists: It may seem obvious that data scientists need to understand graph algorithms. But these analytics experts often have their hands full with different types of data sets, problems and solutions, and many simply haven’t had a chance to get to grips with graphs yet. For data scientists, the gap usually comes down to knowing what graph techniques exist and how to apply them to specific situations. By spending time learning and applying a dozen or two dozen specific graph algorithms, data scientists can often become adept very quickly at using these tools to solve problems within their own organization. It is also important for these experts to know how to convert raw data into the appropriate format for graph algorithms, and to understand what types of software tools can help them use graphs within their organizations.
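
To make the data-conversion step concrete, here is a rough sketch of turning raw tabular records into a graph. It assumes, purely for illustration, that each record is a `(source, target, amount)` tuple such as a payment between two accounts; the `build_graph` function and the account names are hypothetical.

```python
def build_graph(records):
    """Turn (source, target, amount) records into a weighted directed graph.

    The result maps each vertex to a dict of {neighbor: total amount}.
    Repeated source/target pairs are merged by summing their amounts.
    """
    graph = {}
    for source, target, amount in records:
        out_edges = graph.setdefault(source, {})
        out_edges[target] = out_edges.get(target, 0) + amount
        # Register the target as a vertex even if it has no outgoing edges.
        graph.setdefault(target, {})
    return graph


# Hypothetical raw rows, e.g. as read from a CSV export.
records = [
    ("acct_a", "acct_b", 100),
    ("acct_a", "acct_b", 50),
    ("acct_b", "acct_c", 75),
]

graph = build_graph(records)
print(graph["acct_a"])  # {'acct_b': 150}
```

Whether duplicate rows should be summed, counted or kept as separate edges depends on the question being asked; summing is just one reasonable choice.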

Once data scientists understand some of the basics of graph algorithms, they are often able to unlock significant value. For example, a data scientist working for a social media company might use graph algorithms to discover better ways to make recommendations to users, improve service quality and generate more revenue from ads. Meanwhile, a data scientist working in finance may use graphs to model the relationships between different types of assets or between various financial institutions. In industries like manufacturing, data scientists can use graphs to optimize supply chains and shipping routes.
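
The shipping-route case above can be illustrated with Dijkstra's shortest-path algorithm, one classic graph algorithm for this kind of problem. The sketch below assumes non-negative edge costs; the `shortest_distances` function, the location names and the costs are all hypothetical.

```python
import heapq


def shortest_distances(graph, start):
    """Dijkstra's algorithm: lowest total cost from start to each reachable vertex.

    graph maps each vertex to a dict of {neighbor: non-negative edge cost}.
    """
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # Stale heap entry: a cheaper path to u was already found.
        for v, cost in graph.get(u, {}).items():
            candidate = d + cost
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Hypothetical shipping network with made-up costs.
routes = {
    "factory": {"port": 4, "warehouse": 1},
    "warehouse": {"port": 2, "store": 7},
    "port": {"store": 3},
}

dist = shortest_distances(routes, "factory")
print(dist["store"])  # 6, via factory -> warehouse -> port -> store
```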

In addition to knowing the basics of different graph algorithms, data scientists should understand how to use high-performance computational methods to process graph data, as well as visualization techniques to help stakeholders understand information. These skills will become increasingly important as the volume of data collected and managed by organizations continues to grow.

Software engineers: Different organizations will use graph algorithms in different ways, and having in-house engineering talent that understands how to make graph analysis faster and more efficient can be extremely helpful. Again, high-performance computing techniques are crucial for putting graph algorithms into practice. When software engineers acquire skills in performance-oriented topics such as parallel computing, data compression, locality optimization and algorithmic optimization, they can add significant value to the graph analysis processes of their organization.

Returning to the examples of social media and route optimization, one can see the impact of high-performance computing on the use of graph algorithms. Data scientists at big social media companies like Facebook and Twitter process billions, if not trillions, of vertices and edges, and it’s just not practical to process that much data in a timely manner without leveraging high-performance computing techniques. In travel and logistics, data can change rapidly with the ebb and flow of traffic, and it’s important that analytics tools are able to respond effectively to these changes in real time, rather than recalculating from scratch whenever new data is added to a model.
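
One simple illustration of updating a result instead of recomputing it is incremental connectivity with a union-find (disjoint-set) structure: each new edge is absorbed in near-constant amortized time, and "are these two vertices connected?" can be answered without rescanning the graph. This is only a sketch of the general idea, not the method used by any particular company; the `UnionFind` class and vertex names are hypothetical, and this structure handles edge insertions only, not deletions.

```python
class UnionFind:
    """Tracks connected components as edges are added one at a time."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        """Return the representative of x's component, adding x if unseen."""
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every vertex on the path directly at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def add_edge(self, u, v):
        """Merge the components of u and v (union by size)."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]

    def connected(self, u, v):
        return self.find(u) == self.find(v)


uf = UnionFind()
uf.add_edge("a", "b")
uf.add_edge("c", "d")
before = uf.connected("a", "d")  # False: two separate components so far
uf.add_edge("b", "c")            # A new edge arrives; no full recomputation needed
after = uf.connected("a", "d")   # True
print(before, after)
```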

Project managers: People who oversee projects related to graph algorithms may not need the same kind of in-depth knowledge that data scientists need, but they can still benefit greatly from a basic understanding of what graph algorithms are, how they work and the types of problems they can be used to solve. Without this knowledge, a project manager won’t be able to make informed suggestions about ideas for data science teams to test, or about how to improve the performance of graph-related solutions. For project managers, a broad conceptual understanding of graph algorithms is more important than practical experience. Really, it’s similar to anything else: although a CEO doesn’t have to be an expert in IT, finance, human resources or marketing, it’s important that they have at least some comfort and familiarity with each of these areas. Similarly, it is important that project managers overseeing solutions that use graph algorithms understand the basics.

Organizations are now collecting and managing more data than ever before, and that volume is only going to increase every year. Graph algorithms can help organizations position themselves to create value from this data, better serve their customers, and maintain a competitive advantage.

Feature image via Pixabay.

Sharon D. Cole