Sailor Shift: Rise and Resonance

Author

Linsen Lu

Getting Started

Installing and loading the required libraries

pacman::p_load(igraph, tidygraph, ggraph, 
               visNetwork, lubridate, clock,
               tidyverse, graphlayouts, 
               concaveman, ggforce, jsonlite, dplyr)

Importing data

t_data <- fromJSON("data/MC1_graph.json",
                   simplifyDataFrame = TRUE)

1. Introduction

Sailor Shift is one of the most influential figures in the development of “Oceanus Folk” music. From her humble beginnings as a singer on Oceanus Island to her current status as a global superstar, she has not only achieved personal success but also carried this niche genre to a worldwide audience. This project uses data analysis and visualization to examine her network of collaborations, her musical influences, and her place in the wider music ecosystem. We show how she has influenced others and been shaped by the zeitgeist, and reflect on what her rise reveals about the new generation of musicians.

2. Data processing

2.1. Extracting Edges and Nodes

nodes_tbl <- as_tibble(t_data$nodes)
edges_tbl <- as_tibble(t_data$links) 

2.2. A closer look at the data

2.2.1. Edges

glimpse(edges_tbl)
Rows: 37,857
Columns: 4
$ `Edge Type` <chr> "InterpolatesFrom", "RecordedBy", "PerformerOf", "Composer…
$ source      <int> 0, 0, 1, 1, 2, 2, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5…
$ target      <int> 1841, 4, 0, 16180, 0, 16180, 0, 5088, 14332, 11677, 2479, …
$ key         <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
length(unique(edges_tbl$`Edge Type`))
[1] 12
unique(edges_tbl$`Edge Type`)
 [1] "InterpolatesFrom"   "RecordedBy"         "PerformerOf"       
 [4] "ComposerOf"         "ProducerOf"         "InStyleOf"         
 [7] "LyricalReferenceTo" "CoverOf"            "DistributedBy"     
[10] "MemberOf"           "LyricistOf"         "DirectlySamples"   

The edges dataset contains 37,857 records and 4 fields that represent the relationships between entities in the network. Each edge stores the node IDs of its start and end points (source and target) and an Edge Type describing the nature of the relationship; there are 12 edge types, such as “PerformerOf”, “ComposerOf”, and “RecordedBy”. The key field distinguishes multiple connections between the same node pair.
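To make the role of the key field concrete, the sketch below counts how many edges share the same source and target pair. It runs on a small made-up table (`toy_edges`, with hypothetical values) that only mimics the columns of `edges_tbl`; the assumption that key increments for each additional edge between the same pair is illustrative and not confirmed by the data shown above.

```r
library(dplyr)

# Toy edge list (hypothetical values) with the same columns as edges_tbl
toy_edges <- tibble(
  `Edge Type` = c("PerformerOf", "ComposerOf", "PerformerOf", "RecordedBy"),
  source      = c(1L, 1L, 2L, 3L),
  target      = c(0L, 0L, 0L, 4L),
  key         = c(0L, 1L, 0L, 0L)  # assumed: key increments for parallel edges
)

# Count edges per ordered node pair; n_edges > 1 marks parallel edges
parallel_pairs <- toy_edges %>%
  count(source, target, name = "n_edges") %>%
  filter(n_edges > 1)

parallel_pairs
```

Applying the same `count()` call to `edges_tbl` would show how often one pair of nodes is linked by more than one relationship.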

2.2.2. Nodes

glimpse(nodes_tbl)
Rows: 17,412
Columns: 10
$ `Node Type`    <chr> "Song", "Person", "Person", "Person", "RecordLabel", "S…
$ name           <chr> "Breaking These Chains", "Carlos Duffy", "Min Qin", "Xi…
$ single         <lgl> TRUE, NA, NA, NA, NA, FALSE, NA, NA, NA, NA, TRUE, NA, …
$ release_date   <chr> "2017", NA, NA, NA, NA, "2026", NA, NA, NA, NA, "2020",…
$ genre          <chr> "Oceanus Folk", NA, NA, NA, NA, "Lo-Fi Electronica", NA…
$ notable        <lgl> TRUE, NA, NA, NA, NA, TRUE, NA, NA, NA, NA, TRUE, NA, N…
$ id             <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1…
$ written_date   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "2020", NA, NA,…
$ stage_name     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ notoriety_date <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…

The nodes dataset contains 17,412 entries, each representing an entity in the music network and categorized under the Node Type column, for example as “Person”, “Song”, or “RecordLabel”. Each node carries attributes that depend on its type: songs have fields such as single, release_date, genre, and notable, while people may have stage_name and notoriety_date. The many missing values (NA) indicate that certain attributes apply only to specific node types.
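One way to check that missing values follow the node type is to compute the share of NA per attribute within each type. The sketch below does this on a small made-up table (`toy_nodes`, hypothetical values) with a subset of the columns of `nodes_tbl`.

```r
library(dplyr)

# Toy node table (hypothetical values) with a few of the nodes_tbl columns
toy_nodes <- tibble(
  `Node Type` = c("Song", "Person", "Person", "RecordLabel"),
  name        = c("Song A", "Person A", "Person B", "Label A"),
  genre       = c("Oceanus Folk", NA, NA, NA),
  stage_name  = c(NA, "Stage A", NA, NA)
)

# Share of missing values per attribute, by node type
na_share <- toy_nodes %>%
  group_by(`Node Type`) %>%
  summarise(across(c(genre, stage_name), ~ mean(is.na(.x))),
            .groups = "drop")

na_share
```

A share of 1 for an attribute within a node type suggests the attribute does not apply to that type at all, while intermediate values point to genuinely incomplete records.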

2.2.3. Initial EDA

ggplot(data = edges_tbl,
       aes(y = `Edge Type`)) +
  geom_bar()

The bar chart above shows the distribution of edge types in the music relationship network. The most common type is PerformerOf, indicating that the data heavily captures who performed which work. Other frequent types include ComposerOf, LyricistOf, and ProducerOf, highlighting the importance of creative and production roles. In contrast, relationships like MemberOf and DirectlySamples are less common, suggesting these connections are either rarer or less documented.
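The ranking read off the chart can also be produced as a table. The following sketch, run on a made-up vector of edge types (hypothetical counts), sorts the types by frequency and adds each type's share of all edges.

```r
library(dplyr)

# Toy edge types (hypothetical counts, not the real distribution)
toy_edges <- tibble(
  `Edge Type` = c("PerformerOf", "PerformerOf", "ComposerOf",
                  "MemberOf", "PerformerOf", "ComposerOf")
)

# Frequency table sorted from most to least common, with shares
edge_counts <- toy_edges %>%
  count(`Edge Type`, sort = TRUE) %>%
  mutate(share = n / sum(n))

edge_counts
```

The same two calls applied to `edges_tbl` would give exact counts to accompany the bar chart.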

ggplot(data = nodes_tbl,
       aes(y = `Node Type`)) +
  geom_bar()

This bar chart displays the distribution of different node types in the music network dataset. The most common type is Person, with a count far exceeding other categories, indicating a strong focus on individual artists, producers, and contributors. Songs also appear in large numbers, highlighting the dataset’s emphasis on works being created or performed. Other types like Albums, RecordLabels, and MusicalGroups are present but in significantly smaller quantities.

3. Creating Knowledge Graph

3.1. Mapping from node id to row index

id_map <- tibble(id = nodes_tbl$id,
                 index = seq_len(
                   nrow(nodes_tbl)))

3.2. Map source and target IDs to row indices

edges_tbl <- edges_tbl %>%
  left_join(id_map, by = c("source" = "id")) %>%
  rename(from = index) %>%
  left_join(id_map, by = c("target" = "id")) %>%
  rename(to = index)

3.3. Filter out any unmatched (invalid) edges

edges_tbl <- edges_tbl %>%
  filter(!is.na(from), !is.na(to))
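
The mapping is needed because tbl_graph() reads integer from and to columns as row positions in the node table, whereas the ids in this dataset start at 0. The sketch below repeats steps 3.1 to 3.3 on a tiny made-up example (`toy_nodes`, `toy_edges`, hypothetical ids) in which one edge points at an id that has no node, so the effect of the final filter is visible.

```r
library(dplyr)

# Toy data (hypothetical ids): the target 99 has no matching node
toy_nodes <- tibble(id = c(10L, 20L, 30L), name = c("A", "B", "C"))
toy_edges <- tibble(source = c(10L, 20L, 30L),
                    target = c(20L, 30L, 99L))

# Node id -> row index lookup
id_map <- tibble(id = toy_nodes$id,
                 index = seq_len(nrow(toy_nodes)))

# Translate source and target ids into row indices
mapped <- toy_edges %>%
  left_join(id_map, by = c("source" = "id")) %>%
  rename(from = index) %>%
  left_join(id_map, by = c("target" = "id")) %>%
  rename(to = index)

# Edges with an unmatched endpoint, and the edges that survive the filter
dropped <- mapped %>% filter(is.na(from) | is.na(to))
valid   <- mapped %>% filter(!is.na(from), !is.na(to))

nrow(dropped)
valid
```

Inspecting a table like `dropped` before discarding it is a cheap way to see how many edges, if any, reference nodes that are absent from the node table.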

3.4. Creating tidygraph

graph <- tbl_graph(nodes = nodes_tbl, 
                   edges = edges_tbl, 
                   directed = t_data$directed)
class(graph)
[1] "tbl_graph" "igraph"   

4. Visualising the knowledge graph

set.seed(1234)

4.1. Visualising the whole graph

ggraph(graph, layout = "fr") +
  geom_edge_link(alpha = 0.3, 
                 colour = "gray") +
  geom_node_point(aes(color = `Node Type`), 
                  size = 4) +
  geom_node_text(aes(label = name), 
                 repel = TRUE, 
                 size = 2.5) +
  theme_void()

4.2. Visualising the sub-graph

4.2.1. Filtering edges to only “MemberOf”

graph_memberof <- graph %>%
  activate(edges) %>%
  filter(`Edge Type` == "MemberOf")

4.2.2. Extracting only connected nodes (i.e., used in these edges)

used_node_indices <- graph_memberof %>%
  activate(edges) %>%
  as_tibble() %>%
  select(from, to) %>%
  unlist() %>%
  unique()

4.2.3. Keeping only those nodes

graph_memberof <- graph_memberof %>%
  activate(nodes) %>%
  mutate(row_id = row_number()) %>%
  filter(row_id %in% used_node_indices) %>%
  select(-row_id)  # optional cleanup
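
As an alternative to collecting the used node indices by hand, tidygraph offers the node_is_isolated() helper, which flags nodes without any remaining edge. The sketch below is a minimal illustration on a made-up graph (`toy_graph`, hypothetical names), not a replacement tested against the full dataset.

```r
library(dplyr)
library(tidygraph)

# Toy graph (hypothetical names): two MemberOf edges and one PerformerOf edge
toy_nodes <- tibble(name = c("Person A", "Person B", "Group A", "Song A"))
toy_edges <- tibble(
  from        = c(1L, 2L, 1L),
  to          = c(3L, 3L, 4L),
  `Edge Type` = c("MemberOf", "MemberOf", "PerformerOf")
)

toy_graph <- tbl_graph(nodes = toy_nodes,
                       edges = toy_edges,
                       directed = TRUE)

# Keep MemberOf edges, then drop nodes left without any edge
toy_memberof <- toy_graph %>%
  activate(edges) %>%
  filter(`Edge Type` == "MemberOf") %>%
  activate(nodes) %>%
  filter(!node_is_isolated())

member_names <- toy_memberof %>%
  activate(nodes) %>%
  as_tibble() %>%
  pull(name)

member_names
```

Here "Song A" disappears because its only edge was a PerformerOf edge, which is the same outcome the index-based steps 4.2.2 and 4.2.3 aim for.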

4.2.4. Plotting the sub-graph

ggraph(graph_memberof, 
       layout = "fr") +
  geom_edge_link(alpha = 0.5, 
                 colour = "gray") +
  geom_node_point(aes(color = `Node Type`), 
                  size = 1) +
  geom_node_text(aes(label = name), 
                 repel = TRUE, 
                 size = 2.5) +
  theme_void()

5. Sailor Shift’s Career Connections

5.1. The contributors who shaped the modern Sailor Shift

A singer’s journey to fame is never a solitary one. Sailor has been accompanied by many — fellow singers, producers, instrumentalists, composers, and others who helped shape her path.

# Row index of Sailor Shift in nodes_tbl
sailor_idx <- which(nodes_tbl$name == "Sailor Shift")
# Row indices of the works Sailor Shift performed
perf_edges <- graph %>%
  activate(edges) %>%
  as_tibble() %>%
  filter(`Edge Type` == "PerformerOf", from == sailor_idx)

sailor_works_idx <- perf_edges %>% pull(to) %>% unique()

focus_idx1 <- unique(c(sailor_idx, sailor_works_idx))
# Keep Edges that 'influence' Sailor Shift's works
influence_types1 <- c("ComposerOf", "ProducerOf", "LyricistOf", "CoverOf")

graph_influence1 <- graph %>%
  activate(edges) %>%
  filter(
    `Edge Type` %in% influence_types1,
    to %in% focus_idx1
  )
# Extract Nodes 
used_node_indices1 <- graph_influence1 %>%
  activate(edges) %>%
  as_tibble() %>%
  select(from, to) %>%
  unlist() %>%
  unique()
# Keep Nodes
graph_influence1 <- graph_influence1 %>%
  activate(nodes) %>%
  mutate(.row = row_number()) %>%
  filter(.row %in% used_node_indices1) %>%
  select(-.row)
# Plot
ggraph(graph_influence1, layout = "fr") +
  geom_edge_link(aes(color = `Edge Type`),
                 arrow    = arrow(length = unit(4, "pt"), type = "closed"),
                 end_cap  = circle(3, "pt"),
                 start_cap = circle(3, "pt"),
                 width    = 0.5,
                 alpha    = 0.6,
                 show.legend = TRUE) +
  geom_node_point(aes(color = `Node Type`),
                  size = 2) +
  geom_node_text(aes(label = name),
                 size         = 2.5,
                 repel        = TRUE,
                 max.overlaps = Inf) +
  scale_edge_colour_brewer(palette = "Set2",
                           name    = "Edge Type") +
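  scale_color_manual(values = c(
    # One colour per node type that can appear among her works and contributors
    "Person"       = "#377EB8",
    "Album"        = "#E41A1C",
    "Song"         = "#4DAF4A",
    "RecordLabel"  = "#984EA3",
    "MusicalGroup" = "#FF7F00"
    ), name = "Node Type") +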
  theme_void() +
  theme(
    legend.position = "right",
    legend.title    = element_text(size = 10),
    legend.text     = element_text(size = 8),
    plot.margin     = margin(5, 5, 5, 5)
  )

This network diagram places Sailor Shift at its center and reveals the teams behind each album. By mapping the ComposerOf, ProducerOf, LyricistOf, and CoverOf relationships, it shows which composers, producers, and record labels have shaped her work. Ewan MacRae has had the greatest influence on her discography: he composed Oceanbound alone and teamed up with Freya Lindholm and Astrid Nørgaard to co-create Coral Beats, leaving a mark on two albums, more than any other contributor.

5.2. Who Sailor Shift has influenced

Throughout her career, Sailor has not only absorbed influences from others; her work has also begun to inspire other artists, extending her creative reach beyond her immediate circle.

# Sailor's works
layer1_targets <- perf_edges %>%
  pull(to)
# Works influenced by Sailor's works
influence_types2 <- c("DirectlySamples", "InStyleOf",
                     "LyricalReferenceTo", "InterpolatesFrom")

layer2_targets <- graph %>%
  activate(edges) %>%
  as_tibble() %>%
  filter(`Edge Type` %in% influence_types2,
         from %in% layer1_targets) %>%
  pull(to)
# Creators of those influenced works
creator_types <- c("ComposerOf", "ProducerOf", "LyricistOf")


graph_sub2 <- graph %>%
  activate(edges) %>%
  filter(
    (`Edge Type` == "PerformerOf" & from == sailor_idx) |
    (`Edge Type` %in% influence_types2 & from %in% layer1_targets) |
    (`Edge Type` %in% creator_types & to %in% layer2_targets)
  )


used_nodes2 <- graph_sub2 %>%
  activate(edges) %>%
  as_tibble() %>%
  select(from, to) %>%
  unlist() %>%
  unique()


graph_sub2 <- graph_sub2 %>%
  activate(nodes) %>%
  mutate(.row = row_number()) %>%
  filter(.row %in% used_nodes2) %>%
  select(-.row)



ggraph(graph_sub2, layout = "fr") +
  geom_edge_link(aes(color = `Edge Type`),
                 arrow      = arrow(length = unit(3, "pt"), type = "closed"),
                 end_cap    = circle(2.5, "pt"),
                 start_cap  = circle(2.5, "pt"),
                 width      = 0.6,
                 alpha      = 0.7) +
  geom_node_point(aes(color = `Node Type`), size = 3) +
  geom_node_text(aes(label = name), repel = TRUE, size = 2.5, max.overlaps = Inf) +
  scale_edge_colour_manual(values = c(
    PerformerOf        = "#8DD3C7",
    DirectlySamples    = "#FB8072",
    InStyleOf          = "#80B1D3",
    LyricalReferenceTo = "#FDB462",
    InterpolatesFrom   = "#B3DE69",
    ComposerOf         = "#FCCDE5",
    ProducerOf         = "#BEBADA",
    LyricistOf         = "#FFED6F"
  ), name = "Relation") +
  scale_color_manual(values = c(
    Person       = "#377EB8",
    Album        = "#E41A1C",
    Song         = "#4DAF4A",
    RecordLabel  = "#984EA3",
    MusicalGroup = "#FF7F00"
  ), name = "Node Type") +
  theme_void() +
  theme(
    legend.position = "right",
    legend.title    = element_text(size = 10),
    legend.text     = element_text(size = 8)
  )

This visualization is organized in four layers. At the center sits Sailor Shift (blue), surrounded by the albums and songs she performed (red and green). The third layer contains the songs that directly sample, stylistically echo, lyrically reference, or interpolate her work (green), and the outermost layer identifies the composers, producers, and lyricists (blue) behind those derivative pieces. Counting connections shows that Wei Zhao stands out as the most heavily influenced creator, appearing on two separate derivative tracks, which makes them the individual most shaped by Sailor Shift’s musical legacy.
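
The counting step behind this reading can be sketched as a three-hop traversal followed by a tally. The example below uses a made-up edge list (`toy_nodes`, `toy_edges`, hypothetical names) and follows the same convention as the filters above, namely that an influence edge runs from the artist's work to the related work. That direction is an assumption: if the data encodes the opposite direction, from and to must be swapped in the second step.

```r
library(dplyr)

# Toy data (hypothetical names); from/to are row indices into toy_nodes
toy_nodes <- tibble(name = c("Artist S", "Work S", "Work D1",
                             "Work D2", "Creator X", "Creator Y"))
toy_edges <- tibble(
  from        = c(1L, 2L, 2L, 5L, 5L, 6L),
  to          = c(2L, 3L, 4L, 3L, 4L, 4L),
  `Edge Type` = c("PerformerOf", "InStyleOf", "DirectlySamples",
                  "ComposerOf", "ProducerOf", "LyricistOf")
)

influence_types <- c("DirectlySamples", "InStyleOf",
                     "LyricalReferenceTo", "InterpolatesFrom")
creator_types   <- c("ComposerOf", "ProducerOf", "LyricistOf")

artist_idx <- which(toy_nodes$name == "Artist S")

# Hop 1: works performed by the artist
works <- toy_edges %>%
  filter(`Edge Type` == "PerformerOf", from == artist_idx) %>%
  pull(to)

# Hop 2: works linked to them by an influence edge (direction assumed)
related <- toy_edges %>%
  filter(`Edge Type` %in% influence_types, from %in% works) %>%
  pull(to) %>%
  unique()

# Hop 3: creators of those works, ranked by number of distinct works
creator_counts <- toy_edges %>%
  filter(`Edge Type` %in% creator_types, to %in% related) %>%
  group_by(from) %>%
  summarise(n_related_works = n_distinct(to), .groups = "drop") %>%
  mutate(creator = toy_nodes$name[from]) %>%
  arrange(desc(n_related_works))

creator_counts
```

Counting distinct works rather than raw edges avoids ranking a creator higher merely because they hold several roles on the same track.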

5.3. Sailor Shift’s influence on the Oceanus Folk community

# Sailor's Index
sailor_idx <- which(nodes_tbl$name == "Sailor Shift")
# Sailor's works
creative_edge_types <- c("PerformerOf")

perf_edges <- graph %>%
  activate(edges) %>%
  as_tibble() %>%
  filter(`Edge Type` %in% creative_edge_types, from == sailor_idx)

sailor_works_idx <- perf_edges %>% pull(to) %>% unique()
nodes_tbl[sailor_works_idx, ]
# A tibble: 26 × 10
   `Node Type` name         single release_date genre notable    id written_date
   <chr>       <chr>        <lgl>  <chr>        <chr> <lgl>   <int> <chr>       
 1 Album       Tidal Pop W… NA     2028         Ocea… TRUE    17272 2027        
 2 Album       Salty Dreams NA     2030         Ocea… TRUE    17273 2029        
 3 Album       The Current… NA     2032         Ocea… TRUE    17274 2031        
 4 Album       Coral Beats  NA     2034         Ocea… TRUE    17275 2033        
 5 Album       Tides & Bal… NA     2036         Ocea… TRUE    17276 2035        
 6 Album       Oceanbound   NA     2038         Ocea… TRUE    17277 2037        
 7 Album       Echoes of t… NA     2040         Ocea… TRUE    17278 2039        
 8 Song        High Tide H… TRUE   2028         Ocea… FALSE   17279 <NA>        
 9 Song        Electric Ee… FALSE  2028         Ocea… TRUE    17280 <NA>        
10 Song        Sun-Drenche… FALSE  2028         Ocea… FALSE   17281 <NA>        
# ℹ 16 more rows
# ℹ 2 more variables: stage_name <chr>, notoriety_date <chr>
# Oceanus Folk Community works
oceanus_works_idx <- nodes_tbl %>%
  mutate(idx = row_number()) %>%
  filter(genre == "Oceanus Folk") %>%
  pull(idx)
# Combine all nodes
focus_idx <- unique(c(sailor_works_idx, oceanus_works_idx))
nodes_tbl[focus_idx, ]
# A tibble: 305 × 10
   `Node Type` name         single release_date genre notable    id written_date
   <chr>       <chr>        <lgl>  <chr>        <chr> <lgl>   <int> <chr>       
 1 Album       Tidal Pop W… NA     2028         Ocea… TRUE    17272 2027        
 2 Album       Salty Dreams NA     2030         Ocea… TRUE    17273 2029        
 3 Album       The Current… NA     2032         Ocea… TRUE    17274 2031        
 4 Album       Coral Beats  NA     2034         Ocea… TRUE    17275 2033        
 5 Album       Tides & Bal… NA     2036         Ocea… TRUE    17276 2035        
 6 Album       Oceanbound   NA     2038         Ocea… TRUE    17277 2037        
 7 Album       Echoes of t… NA     2040         Ocea… TRUE    17278 2039        
 8 Song        High Tide H… TRUE   2028         Ocea… FALSE   17279 <NA>        
 9 Song        Electric Ee… FALSE  2028         Ocea… TRUE    17280 <NA>        
10 Song        Sun-Drenche… FALSE  2028         Ocea… FALSE   17281 <NA>        
# ℹ 295 more rows
# ℹ 2 more variables: stage_name <chr>, notoriety_date <chr>
# Influence Types

influence_types3 <- c(
  "DirectlySamples",
  "InStyleOf",
  "LyricalReferenceTo",
  "InterpolatesFrom",
  "CoverOf"
)
# Filter Edges
graph_3 <- graph %>%
  activate(edges) %>%
  filter(`Edge Type` %in% influence_types3 )
# Extracting Nodes


used_node_indices3 <- graph_3 %>%
  activate(edges) %>%
  as_tibble() %>%
  select(from, to) %>%
  unlist() %>%
  unique()
# Keep Nodes

graph_3 <- graph_3 %>%
  activate(nodes) %>%
  mutate(row_id = row_number()) %>%
  filter(row_id %in% focus_idx) %>%
  select(-row_id)  # optional cleanup
# Add label
graph_3 <- graph_3 %>%
  activate(nodes) %>%
  mutate(is_sailor_work = ifelse(name %in% nodes_tbl$name[sailor_works_idx], 
                                 "Sailor's Work", "Other"))
# Plotting

ggraph(graph_3, layout = "fr") +
  geom_edge_link(alpha = 0.5, colour = "gray") +
  geom_node_point(aes(color = is_sailor_work), size = 1.5) +
  theme_void()

Sailor Shift has influenced collaborators in the Oceanus Folk community primarily through indirect inspiration. Her works, though few in number, are embedded across different parts of the network, suggesting they have been referenced or sampled by multiple creators. While she doesn’t appear to collaborate repeatedly with specific individuals, her influence spans stylistic clusters, indicating a broad and decentralized artistic impact.
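
How embedded a group of works is can be put into numbers by comparing node degrees inside the influence sub-graph. The sketch below uses tidygraph's centrality_degree() on a made-up graph (`toy_graph`, hypothetical names and edges) that carries the same is_sailor_work label as the plot above; it illustrates the idea rather than reproducing the analysis.

```r
library(dplyr)
library(tidygraph)

# Toy influence sub-graph (hypothetical works and edges)
toy_nodes <- tibble(
  name           = c("Work S1", "Work S2", "Work O1", "Work O2", "Work O3"),
  is_sailor_work = c("Sailor's Work", "Sailor's Work",
                     "Other", "Other", "Other")
)
toy_edges <- tibble(
  from = c(3L, 4L, 5L, 2L),
  to   = c(1L, 1L, 3L, 5L)
)

toy_graph <- tbl_graph(nodes = toy_nodes,
                       edges = toy_edges,
                       directed = TRUE)

# Degree per node, counting incoming and outgoing influence edges
degree_tbl <- toy_graph %>%
  activate(nodes) %>%
  mutate(degree = centrality_degree(mode = "all")) %>%
  as_tibble() %>%
  arrange(desc(degree))

# Average degree of the labelled works versus all other works
degree_summary <- degree_tbl %>%
  group_by(is_sailor_work) %>%
  summarise(mean_degree = mean(degree), .groups = "drop")

degree_summary
```

On the real sub-graph, a comparable mean degree for both groups would support the reading that her works sit inside the community's web of references rather than at its edge.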