
DATA PREPARATION FOR GEPHI FROM CSV FILE

Consider the following edge list of characters in a novel.

For Gephi to read this data, we need to transform it into two separate datasheets: a “nodes” sheet and an
“edges” sheet.
The data transformation consists of two stages. In the first stage, we create a nodes sheet, where each
node is assigned a unique Id. In the second stage, we create an edges sheet, where all relations between
nodes are expressed as relations between Ids. Name the sheet that holds the edge list “Interaction” and
name a second sheet “nodes”.

Copy and paste all characters from the Interaction sheet into the nodes sheet. Make sure that all characters
end up in a single column of the nodes sheet.
Click on any non-empty cell in the nodes sheet. Then click Remove Duplicates (on Excel's Data tab) to
remove the repeated names. This gives you a list of unique characters.

Label this column as “Label”.

Add an “Id” column and assign each character a unique Id.

Save the “nodes” sheet as a CSV file named character-nodes.csv.
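
For readers who prefer a script to a spreadsheet, the nodes stage above can be sketched in Python. This is only an illustrative sketch: the sample names in `interactions` and the helper names `build_nodes` and `write_nodes_csv` are hypothetical and do not come from the tutorial's data. The column headers “Id” and “Label” follow the sheet described above.

```python
import csv

def build_nodes(edge_list):
    """Return (Id, Label) rows, one per unique character, in first-seen order."""
    seen = {}
    for source, target in edge_list:
        for name in (source, target):
            name = name.strip()
            if name and name not in seen:
                # Ids simply count up from 1; any unique value would work
                seen[name] = len(seen) + 1
    return [(node_id, label) for label, node_id in seen.items()]

def write_nodes_csv(nodes, path="character-nodes.csv"):
    """Write the nodes table with the headers used in the nodes sheet."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Id", "Label"])
        writer.writerows(nodes)

# Hypothetical sample edge list (not the tutorial's actual data)
interactions = [
    ("Alice", "Bob"),
    ("Bob", "Carol"),
    ("Alice", "Carol"),
    ("Carol", "Alice"),
]
nodes = build_nodes(interactions)
```

Calling `write_nodes_csv(nodes)` would then produce a file equivalent to the one saved from Excel.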


Create a new worksheet in Excel and name it “Edges”.

Name the two columns in the Edges sheet “Source” and “Target”, the column headers Gephi expects in an edges table.

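In the “nodes” sheet, copy the Ids to a column on the right of the Label column. VLOOKUP searches the leftmost column of its lookup range and returns a value from a column to its right, so the Ids must sit to the right of the labels.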

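In the “Interaction” sheet, use the VLOOKUP function to convert all characters in the first column to Ids. For example, assuming the labels are in column B of the nodes sheet and the copied Ids are in column C, the formula would have the form =VLOOKUP(A2, nodes!B:C, 2, FALSE), where FALSE requests an exact match.

Repeat the above VLOOKUP to convert all characters in the second column to Ids.
Copy the two columns with Ids.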

Paste them as values into the Edges sheet, so that the cells hold the Ids themselves rather than the VLOOKUP formulas.


Save the Edges sheet as a CSV file named character-edges.csv.
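
The edges stage can be sketched in Python as well, with a dictionary lookup standing in for VLOOKUP. Again, this is a sketch under assumptions: the helper names `build_edges` and `write_edges_csv`, the sample names, and the sample Ids are hypothetical, and the “Source” and “Target” headers are the ones Gephi expects in an edges table.

```python
import csv

def build_edges(edge_list, label_to_id):
    """Replace each character name with its Id (the role VLOOKUP plays in Excel)."""
    edges = []
    for source, target in edge_list:
        # A KeyError here means a name is missing from the nodes table
        edges.append((label_to_id[source.strip()], label_to_id[target.strip()]))
    return edges

def write_edges_csv(edges, path="character-edges.csv"):
    """Write the edges table with Source and Target headers."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Source", "Target"])
        writer.writerows(edges)

# Hypothetical lookup table and edge list (not the tutorial's actual data)
label_to_id = {"Alice": 1, "Bob": 2, "Carol": 3}
interactions = [
    ("Alice", "Bob"),
    ("Bob", "Carol"),
    ("Alice", "Carol"),
    ("Carol", "Alice"),
]
edges = build_edges(interactions, label_to_id)
```

Calling `write_edges_csv(edges)` would write the file; unlike a VLOOKUP that silently shows #N/A, the dictionary lookup stops with an error when a name has no Id.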

Importing csv files into Gephi


Open Gephi and click on “New Project”.

In Gephi, click on Data Laboratory.

In the Data Table, click on Import Spreadsheet.


Choose character-nodes.csv from your directory.

Leave the default settings and finish importing the nodes file.
In the last screen, select “Undirected” as the Graph Type. Select “Directed” instead if your graph is directed.
The default Edges merge strategy, “Average”, should be fine.

Now select the character-edges.csv file to import. Make sure to choose “Edges table” in the “Import as:”
dropdown list.
In the final prompt, select “Undirected” as the Graph Type and check “Append to existing workspace”, so that the edges are added to the nodes already imported.
The data is now loaded into the workspace. Select the Overview tab to see the
corresponding network visualization.
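
If the network looks incomplete or contains unexpected extra nodes, one thing worth checking is whether every Id in the edges file also appears in the nodes file. The following is a minimal sketch of such a check, assuming the headers “Id”, “Source”, and “Target” used above. The function name `find_unknown_ids` and the in-memory sample data are hypothetical.

```python
import csv
import io

def find_unknown_ids(nodes_file, edges_file):
    """Return the sorted edge endpoints that have no matching row in the nodes table."""
    node_ids = {row["Id"] for row in csv.DictReader(nodes_file)}
    unknown = set()
    for row in csv.DictReader(edges_file):
        for column in ("Source", "Target"):
            if row[column] not in node_ids:
                unknown.add(row[column])
    return sorted(unknown)

# Hypothetical in-memory stand-ins for the two CSV files
nodes_csv = io.StringIO("Id,Label\n1,Alice\n2,Bob\n")
edges_csv = io.StringIO("Source,Target\n1,2\n2,3\n")
missing = find_unknown_ids(nodes_csv, edges_csv)
print(missing)
```

To run the check on the real files, pass file objects opened with `open("character-nodes.csv", newline="", encoding="utf-8")` and `open("character-edges.csv", newline="", encoding="utf-8")` in place of the `io.StringIO` samples.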

You can now experiment with the options available in the Overview tab, such as Appearance,
Layout, Filters, and Statistics, to compute network metrics and explore the network topology. You may also
refer to online tutorials on Gephi. One good resource is: https://youtu.be/371n3Ye9vVo
To save your workspace, click the Save button and give the project an appropriate file name.
