Visualizing the Artists Exhibited at the MoMA, 1929-1989


This final lab project examined the artists exhibited at the Museum of Modern Art (MoMA) from 1929-1989. The Museum of Modern Art at present contains approximately 200,000 works of modern and contemporary art, including works of architecture and design, painting, drawing, sculpture, photography, prints, illustrated books, film, and performance art. Its collections and exhibitions have continued to develop and evolve since the museum’s opening in 1929. The large and diverse scope of artists and artworks exhibited at the MoMA throughout its history makes it an interesting subject to examine. The visualizations of this project specifically explore the network of artists featured together in the MoMA’s exhibitions from 1929-1989, the frequency with which artists were exhibited during this 60-year period, and the museum’s representation of male and female artists in its exhibitions. The goals of these visualizations are for users to clearly identify which artists have been most often exhibited together and which artists have been exhibited most frequently at the museum, as well as to understand the relative representation of male and female artists exhibited from 1929-1989. The visualizations can benefit both art experts and general enthusiasts who are curious about the history of the artists exhibited at the Museum of Modern Art.


Methods & Rationale

This lab project consists of four visualizations. The dataset from which these visualizations were created was acquired from the MoMA’s open source GitHub repository. The Museum of Modern Art has made partial information about its collections, artists, and exhibitions publicly available, including the dataset I used, which contains information about MoMA’s exhibitions from 1929-1989. As this dataset contained a lot of information that was not pertinent to my visualizations or was not formatted as I needed, I used OpenRefine, R, and Excel to clean and reformat the data. In addition, as I was creating a variety of visualizations that required two different visualization programs, I cleaned and reformatted two separate datasets. I explain my methods and process for creating these two datasets and their corresponding visualizations below.



The objective of the network-map visualization was to understand the degree of connectivity between artists that were exhibited together at the MoMA. As this visualization was concerned with demonstrating the connections between artists, a network-map was the best method of depicting this information. In order to use Gephi’s network-mapping program, I had to create a dataset which included every artist featured in every exhibition held at the MoMA from 1929-1989. Once I had isolated this information from the original dataset within OpenRefine, I had to transpose the data so that I could link each artist featured in the same exhibition to one another. This process was time consuming, as my dataset was too large to transpose in its entirety within OpenRefine, so I had to transpose the data in segments and later combine them within Excel.


Once the data had been formatted appropriately, I used R to create an edge table in which each artist in an exhibition was linked to every other artist of that same exhibition, with each link weighted according to the number of times the two artists were exhibited together. Therefore, if two artists were exhibited together only once, the edge was given a weight of 1; if exhibited together twice, a weight of 2; and so on. As my dataset contained every exhibition at the MoMA over a 60-year period, it was extremely large: one exhibition alone featured 320 artists. When R finally processed the entire dataset into an edge table, I had a total of 45 million rows of linked artists. Because Gephi can’t handle that much data very well, the data was broken down into subsets with edge weights greater than 1 (18 million rows), greater than 2 (11 million rows), greater than 5 (5 million rows), and greater than 10 (2 million rows). Although I tried to use the subsets with edge weights greater than 2 and greater than 5, the only subset that worked well enough with Gephi on my computer was the one with edge weights greater than 10. I uploaded the undirected edge table with edge weights greater than 10 into Gephi, and Gephi automatically produced a node table of the artists based on my edge table. I only had to copy the artists listed in the node table into the Label column to ensure that the names of the artists would appear in the graph.
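To make the edge-table step concrete, here is a minimal sketch in Python. It is not the R script used for this project; the function names and the toy exhibition data are hypothetical and only illustrate the idea of linking every pair of artists within an exhibition, counting repeated pairings as the edge weight, and then keeping edges above a weight threshold.

```python
from collections import Counter
from itertools import combinations


def pair_count(n_artists):
    """Number of undirected artist pairs produced by one exhibition."""
    return n_artists * (n_artists - 1) // 2


def build_edge_table(exhibitions):
    """Count how often each unordered pair of artists shares an exhibition."""
    weights = Counter()
    for artists in exhibitions.values():
        # Deduplicate and sort so every pair has one canonical order
        # and no artist is ever linked to themselves.
        unique = sorted(set(artists))
        for source, target in combinations(unique, 2):
            weights[(source, target)] += 1
    return weights


def filter_by_weight(weights, threshold):
    """Keep only edges whose weight is strictly greater than the threshold."""
    return {pair: w for pair, w in weights.items() if w > threshold}


# Hypothetical toy data: exhibition id -> artists shown in it.
exhibitions = {
    "ex1": ["Picasso", "Matisse", "Miro"],
    "ex2": ["Picasso", "Matisse"],
    "ex3": ["Matisse", "Picasso", "Pollock"],
}

edges = build_edge_table(exhibitions)
strong_edges = filter_by_weight(edges, 2)

for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

The pair count grows quadratically with the size of an exhibition: a single show with 320 artists yields 320 × 319 / 2 = 51,040 pairs on its own, which helps explain why the full edge table became so large.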


The network included 7,285 artists (nodes) with 288,359 connections (edges). I ran statistical analyses within Gephi to help define the artists and their relationships to one another within the network; specifically, I calculated the Average Degree, Average Weighted Degree, Network Diameter, and Average Path Length. I also ran the Modularity analysis, which distributed the nodes into 9 communities.

As this was such a large and dense network, I chose the ForceAtlas 2 layout with node overlap prevented. Because the graph was still quite dense after running ForceAtlas 2, I next applied the Expansion layout, which expanded the ForceAtlas 2 layout by a scale factor of 1.2. This allowed my nodes to be legible without being too spread out. Next, I sized the nodes according to degree on a scale of 1 to 200. This made it easy to compare artists’ frequency of exhibition, as the node sizes in the network varied widely. I then colored the nodes according to modularity so that each community of artists that were frequently exhibited together was easy to identify.

In the Preview section of Gephi I rescaled the weight of the edges, as the edges were so thick that the entire graph was unviewable. I rescaled the weight and gave the edges a thickness of 35.0. This kept the edges between artists often exhibited together from being overwhelming while leaving them clearly distinguishable from the edges between artists exhibited together less often. Lastly, I filtered out self-loops, so that any edges linking an artist to themselves were removed from the network. The visualization was exported as a PDF file, as it was so large that it needed to be viewed and explored on its own; it can also be seen below.

As there is clearly a lot going on in this large network, I thought it would be helpful for users to see a portion of it. I therefore filtered the network by degree range so that only the 50 most exhibited artists were included. This graph was exported as a .png file, as it is not as large or detailed as the entire network visualization. This filtered network can be seen below.
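As an aside on the network steps described above, the following Python sketch illustrates what degree, weighted degree, self-loop removal, and a top-N filter mean for a weighted edge list. It is only an illustration under stated assumptions: Gephi performed these steps in the project, the function names and toy edges are hypothetical, each undirected pair is assumed to appear once, and the sketch selects the top N nodes directly, whereas Gephi's degree range filter works with a degree threshold.

```python
from collections import defaultdict


def drop_self_loops(edges):
    """Remove edges that link an artist to themselves."""
    return [(s, t, w) for s, t, w in edges if s != t]


def degrees(edges):
    """Return (degree, weighted degree) per node for an undirected edge list."""
    degree = defaultdict(int)
    weighted = defaultdict(int)
    for source, target, weight in edges:
        for node in (source, target):
            degree[node] += 1          # number of distinct connections
            weighted[node] += weight   # sum of the connection weights
    return dict(degree), dict(weighted)


def top_n_subgraph(edges, n):
    """Keep only edges whose endpoints are both among the n highest-degree nodes."""
    degree, _ = degrees(edges)
    # Sort by descending degree, breaking ties alphabetically.
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    keep = set(ranked[:n])
    return [(s, t, w) for s, t, w in edges if s in keep and t in keep]


# Hypothetical toy edge table: (source, target, weight).
raw_edges = [
    ("Picasso", "Matisse", 12),
    ("Picasso", "Miro", 11),
    ("Matisse", "Miro", 11),
    ("Picasso", "Pollock", 11),
    ("Picasso", "Picasso", 15),  # self-loop to be filtered out
]

clean_edges = drop_self_loops(raw_edges)
degree, weighted_degree = degrees(clean_edges)
top3_edges = top_n_subgraph(clean_edges, 3)

print(degree)
print(weighted_degree)
print(top3_edges)
```

In the toy data, Pollock has the lowest degree, so the top-3 filter removes him and the edge that connected him to Picasso.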


The goal of the heatmap visualization was to identify the top 50 artists that were most exhibited at the MoMA from 1929-1989. As this graph was concerned with a total count over a period of time, a heatmap seemed an appropriate way to display this information. To create this visualization in Tableau Public, I had to clean and create a dataset that included columns for the Exhibition Titles, Date of Exhibitions, and the Artists featured in each exhibition. Once the data was cleaned and organized, I loaded it into Tableau Public and created the heatmap. I began by placing the continuous Year (Date) data in the columns section and Artist Name in the rows section. Next, I represented the Sum Number of Records through a color scale, which indicated the number of times each artist was exhibited in each year from 1929-1989. A darker blue indicated a higher number of exhibitions and a lighter blue indicated a lower number of exhibitions. Then I filtered the artists included in the graph, limiting it to the top 50 most exhibited artists. I sorted the 50 artists in descending order so that the most exhibited artists were featured at the top of the visualization. Lastly, I edited the Tooltip hover caption so that the information it displayed was very clear to the user. This heatmap can be seen below.
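For readers who want to see the counting behind a heatmap like this one, here is a small Python sketch. It is not how the project was built (that was done in Tableau Public); the record layout, function name, and toy rows are hypothetical. It counts exhibitions per artist per year, ranks artists by their total, and keeps the top N in descending order.

```python
from collections import Counter


def heatmap_counts(records, top_n):
    """Build a (years, ranked artists, grid) summary from exhibition records.

    records: iterable of (exhibition_title, year, artist) tuples.
    """
    per_cell = Counter((artist, year) for _, year, artist in records)
    totals = Counter(artist for _, _, artist in records)
    # Most exhibited first; ties broken alphabetically.
    ranked = sorted(totals, key=lambda artist: (-totals[artist], artist))[:top_n]
    years = sorted({year for _, year, _ in records})
    # One row per artist, one cell per year; 0 where the artist was not shown.
    grid = {a: [per_cell.get((a, y), 0) for y in years] for a in ranked}
    return years, ranked, grid


# Hypothetical toy rows: (exhibition title, year, artist).
records = [
    ("Ex A", 1939, "Picasso"),
    ("Ex B", 1939, "Picasso"),
    ("Ex B", 1939, "Matisse"),
    ("Ex C", 1945, "Pollock"),
    ("Ex C", 1945, "Picasso"),
    ("Ex D", 1945, "Matisse"),
]

years, ranked, grid = heatmap_counts(records, top_n=2)
for artist in ranked:
    print(artist, dict(zip(years, grid[artist])))
```

Each cell of the grid corresponds to one colored square in the heatmap, with larger counts mapped to darker shades.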

Line Graph

The goal of this visualization was to understand MoMA’s gender representation in the artists it exhibited. As this graph compared the number of male artists exhibited within a given year to the number of female artists exhibited, a line graph was a clear and easy-to-read way to show this information. This visualization was also created in Tableau Public, and used the same dataset as the heatmap, which included a gender column. Within Tableau Public I created this comparative line graph by placing Year (Date) in the columns section and the Sum Number of Records (the artists listed within every exhibition in this 60-year period) in the rows section. I then set color according to gender and filtered out the Null category. Next, I applied a quick table calculation of Percent of Total to the Sum Number of Records, computed using Table (down), so that it compared the percentage of male artists with the percentage of female artists exhibited. Lastly, I again edited the Tooltip hover caption so that the information it displayed was very clear to the user. This produced a visualization showing the percent of total of male and female artists exhibited within each year, which can be seen below.
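The percent-of-total calculation described above can be sketched in a few lines of Python. This is an interpretation rather than Tableau's implementation: it assumes the calculation divides each gender's count by the combined count for that same year, after dropping records with no gender. The function name and toy records are hypothetical.

```python
from collections import Counter


def gender_share_by_year(records):
    """Return {(year, gender): percent of that year's artists}.

    records: iterable of (year, gender) tuples, where gender may be None.
    """
    # Drop records with no recorded gender, mirroring the filtered Null category.
    counts = Counter((year, gender) for year, gender in records if gender is not None)
    year_totals = Counter()
    for (year, _gender), n in counts.items():
        year_totals[year] += n
    return {
        (year, gender): 100.0 * n / year_totals[year]
        for (year, gender), n in counts.items()
    }


# Hypothetical toy rows: one (year, gender) record per exhibited artist.
records = [
    (1940, "Male"), (1940, "Male"), (1940, "Male"), (1940, "Female"),
    (1940, None),
    (1985, "Male"), (1985, "Female"),
]

shares = gender_share_by_year(records)
for key in sorted(shares):
    print(key, shares[key])
```

Because the shares are computed within each year, the male and female percentages for any one year always add up to 100, which is why the two lines in such a graph mirror each other.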

Visualization Layouts

Once I had completed the basic designs of all four of my visualizations, I began working on the layout of my dashboard and considering how best to display these visualizations to users. After attempting to fit an .svg or .png file of the entire network-map visualization into a dashboard in Tableau, I realized that I was going to have to display this large network-map separately from the rest of my visualizations. While the entire network graph was not legible within the Tableau dashboard, the filtered network-map of the top 50 artists fit clearly into Tableau as an image. Within this dashboard I next included the heatmap and line graph visualizations. After playing around with the sizes and locations of these three visualizations, I concluded that I wanted the filtered network-map to go at the top of the dashboard, followed by the heatmap, as the two related to one another, and then the line graph. However, I did not want the dashboard to be so large that a user would have to scroll far down to see each graph. So I decided to place the heatmap and line graph side by side, with the filtered network-map above.

I then realized that with so many different visualizations, each indicating different information about the artists exhibited at the MoMA, it would be beneficial to my users to include a brief introductory description of the visualizations. I placed this introductory note to the left of my filtered network graph, and set it in a large font so that it was easy to read and visibly important to users. While I assumed that eyes would often be drawn to the visualizations before the description, placing it at the top left of the page organized and introduced the visualizations below it well. In addition, I thought it best to include a caption underneath each visualization briefly explaining what was being shown. A brief description was also created for my separate network visualization PDF file, as such a large and complex graph would be confusing to any user without a description.

Lastly, I edited the colors of all of my visualizations. Originally, both my network-map visualizations used a bright pink color for the cluster that contained the majority of the most exhibited artists. The heatmap used a blue color range, and the line graph showed male artists in red and female artists in orange. When looking at these graphs laid out together, I quickly realized that there were no consistent colors. Even though each graph was showing a different piece of information about the artists exhibited at the MoMA, it was clear from our conversations in class and our readings that the differing colors did not help users understand the visualizations. Therefore, I chose a blue color range to create consistency throughout my visualizations, especially as blue is less of a hindrance than other colors for anyone with color-blindness. I applied blue to the main modularity cluster in my network visualizations, kept the blue color range in my heatmap, and changed the line for male artists to blue in my line graph.

Artists Exhibited at the Museum of Modern Art from 1929-1989, Tableau Dashboard

Network of Artists Exhibited together at the MoMA from 1929-1989, Gephi PDF



User Research & Methods

Three users with various interests and levels of experience in the arts were recruited to test and provide feedback on the four visualizations. User 1 has had no formal education in the history of art but is an avid museum-goer and has traveled extensively throughout Europe visiting major museums, artworks, and architecture. User 2 has had some formal education in the history of art but does not work in the art field professionally. User 2 is a member of the MoMA and has frequently visited the museum over the years. User 3 has had formal education in the history of art and is pursuing a career within the art field. As these users have had varying exposure to art and are familiar with museums such as the MoMA, they are good representatives of the user community that would likely engage with and understand the subject of these visualizations.


User testing was conducted face-to-face at a scheduled time for approximately 20-30 minutes. Tests were one-on-one interviews in which users were first asked, before being shown any of the visualizations, to list the first three artists they think of when they think of the MoMA. Following this question, users were asked a combination of design-based questions (e.g. Where does your eye go first?; Would you change any colors or color combinations?) and task completions (e.g. In what year were the most female artists exhibited at the museum?; What was the largest number of exhibitions per year in which Pablo Picasso was featured?). The tests also included open-ended questions (e.g. What can you tell me about the relationship between the top 50 most exhibited artists?; Is there anything that stands out or surprises you about X graph?) and hypothetical content questions (e.g. Would a key be helpful?). In addition, observational notes were taken as users navigated through the visualizations. It is important to note that before any questions were asked or instructions given, users were informed that the survey was testing the usability of the visualizations, not them personally, and that there were no right or wrong answers. This was extremely important to make clear, as UX surveys can often make users feel as though they are being tested and that there is a specific answer the interviewer would like to hear.


Results & Findings



Network of Artists Exhibited at MoMA, Network-Map:

While the Network of Artists Exhibited at MoMA visualization is massive and therefore somewhat difficult to understand, it is clear from the shape of the network that it is a fully, or almost fully, connected structure. This means that if a node were removed from the network, there would be very few isolates and the network would keep the majority of its structure and connections. With this in mind, it is very interesting to examine the way in which artists have been clustered into communities based on their connectivity. For example, the yellow node cluster includes artists such as Roy Lichtenstein, Edward Weston, Walker Evans, Edgar Degas, Andy Warhol, David Hockney, Edward Steichen, Pierre-Auguste Renoir, and Alfred Stieglitz, to name a few. All of these artists can be classified as modern artists. However, they did not work within the same time periods or in the same mediums. I would group Degas with Renoir; Lichtenstein with Warhol and Hockney; and Weston, Evans, Steichen, and Stieglitz together. However, this visualization doesn’t tell us which artists have similar works, but rather which artists’ works have been exhibited alongside one another’s. It is clear that the MoMA from 1929-1989 curated and exhibited artists of varying time periods, mediums, and styles in the same exhibitions, forming interesting connections and juxtapositions between artists that fall within the realm of modern art. Moreover, at some point in time these artists have been exhibited with one another, or nearly so.


Network of Top 50 Artists Most Exhibited at MoMA, Network-Map:

As one can see more clearly within this network-map visualization, it has a fully connected structure. As this filtered network demonstrates, if Pablo Picasso were removed from the network, the rest of the network would still be connected. What is most interesting about this visualization, and what we were unable to see within the larger network, is the degree of connectivity shown by the weighted edges. Within this graph it is easy to identify and compare edge weights. For example, it is evident that Pablo Picasso and Henri Matisse are the two artists most often exhibited together at the MoMA from 1929-1989. In addition to showing the degree of connectivity between artists, this visualization also helps one understand how much more often one artist was exhibited than another. For example, if we compare the size of Jackson Pollock’s node with that of Picasso, Miro, or Matisse, it is clear that Pollock’s work was not exhibited nearly as frequently. By contrast, Pollock’s node is close in size to that of Piet Mondrian, who was not exhibited that much more than Pollock. It is important to remember that this visualization does not depict popularity, but rather the historical trend of exhibiting artists.


The Artists Most Exhibited, Heatmap:

This heatmap depicts the top 50 artists most exhibited at the MoMA from 1929-1989, indicating in which years they were exhibited and the number of times they were exhibited. Scrolling through and hovering the mouse over this visualization, it is easy to identify the years in which certain artists were exhibited more often than in other years or than other artists. For example, it is clear that Jackson Pollock’s first exhibition at the MoMA was in 1945. This makes sense, as Jackson Pollock was not very well known prior to his famous drip paintings, which he did not begin creating until around this time. Therefore, this heatmap makes it clear why Pollock’s node in the previous network-map visualizations is smaller than those of other frequently exhibited artists at the MoMA.


Male Vs. Female Artists Exhibited, Line Graph:

This comparative line graph clearly shows that there were significantly more male artists than female artists exhibited at the MoMA from 1929-1989. This understanding is supported by the previous three visualizations, in which no female artists stand out or are represented on the graphs. While there is relatively little change in female artists’ representation at the MoMA over these 60 years, the line graph does indicate a slight decline in male artists and an increase in female artists during the 1980s. Even though this slight increase in female artists’ representation is interesting, I believe that this graph is more interesting for showing directly what the other visualizations indicate indirectly: that historically there has been a lack of representation of female artists at major museums such as the MoMA.


UX Findings

User testing clearly indicated that my visualizations were user friendly and successful in their objective. I received positive feedback about the layout of my Tableau Dashboard visualizations and the clarity and helpfulness of the descriptions and captions. All of my users said that their eye first gravitated toward the Top 50 Artists Exhibited network visualization in the top right corner, followed by the heatmap and then the line graph. This was the order in which I intended the dashboard to be read. All users found the heatmap and line graph visualizations to be clear and easy to use. I received requests from each user for the network visualization to be click or hover based, as all of them tried to click or hover when viewing the network. This made it clear that visualizations are sometimes restricted by the capabilities of their tools rather than by how they are designed or presented.

The only complaint or suggestion made about the Tableau Dashboard visualizations came from a user who noted that two of the node colors (red and pink) within the network visualization were too similar and difficult to differentiate, and that the dark blue color made the names of the artists hard to read. Another user interestingly observed at first glance that the filtered network visualization reminded them of an airport hub network showing connected flight paths. When I asked each user to explain the relationship they saw within the filtered network visualization on the Tableau Dashboard, they all clearly identified that every artist appeared to be connected to every other, which made sense to them as they understood all of these artists to have been influenced by one another’s work.

When asked that same question in regard to the larger network visualization, users were more confused about the relationships between artists. One user made a surprising but accurate comment about the larger network visualization, stating that it does not show the connections between artists as well as the filtered network visualization does, and is therefore a better graph for comparing which artists were exhibited more than others. I had not considered the larger network visualization in this way, even though I recognized that it was larger and more difficult to read and process. All users commented that a key would be useful for understanding the clusters within the visualization. Once I explained more about the clustering, they had a better grasp of the network and how it operated. Overall, users found each of the visualizations interesting, and the network visualization was described as looking “cool” and “pretty”, but they would have liked more clarity in the larger network visualization.


Moving Forward

The user testing highlighted some clear ways in which my visualizations could be improved. For example, I would alter the colors used within the network map so that no two colors were too similar and so that the colors were light enough for labels to be read. However, most of my attention would turn toward making the larger network visualization clearer. I would provide a key for the colored node clusters, which would clarify how many artists are included within each group. Additionally, I would describe the network visualization more clearly, especially what the sizes of the nodes and edges represent and what the clusters mean.

Based on my own observations during the user testing, I would try to make part of the network visualization click based, as users seemed disappointed that the graph wasn’t. I would do this by including an index of all of the artists featured in the network, with a hyperlink to each artist’s bio page on the MoMA’s website. This would also enable users to easily reference artists and their work, as well as encourage them to discover and investigate further.

Moving forward with this project, I think it would be interesting to include data from 1990 until now so that the information would be complete and up to date. However, as the MoMA is currently withholding that information, this will have to remain a study for the future.