Skip to content

ng_load

Load Data from CSV or Parquet file

It's supported to load data from a CSV or Parquet file into NebulaGraph with the help of ng_load magic.

Examples

For example, to load data from a CSV file actor.csv into a space basketballplayer with tag player and vid in column 0, and props in column 1 and 2:

player_id,name,age
"player999","Tom Hanks",30
"player1000","Tom Cruise",40
"player1001","Jimmy X",33

Then the %ng_load line would be:

%ng_load --header --source actor.csv --tag player --vid 0 --props 1:name,2:age --space basketballplayer
         ────┬─── ────┬───────────── ─────┬────── ───┬─── ─────────┬────────── ────────────┬───────────
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │  ┌────────────────┘          │             │                 ┌────────────────┐
             │        │  │                           │             │                 │Graph Space Name│
             │        │  │            ┌──────────────┘             │                 └────────────────┘
             │        │  │            │    ┌──────────────────────────────────────────────────────────┐
             │        │  │            │    │Properties on <column_index>:<prop_name> if there are any.│
             │        │  │            │    └──────────────────────────────────────────────────────────┘
             │        │  │   ┌────────┴───────────────────────────────────────────────────────────────┐
             │        │  │   │                              For tag, there will be column index of VID│
             │        │  │   │ For edge, there will be src/dst VID index, or optionally the rank index│
             │        │  │   └────────────────────────────────────────────────────────────────────────┘
             │        │  │                                                    ┌───────────────────────┐
             │        │  └────────────────────────────────────────────────────┤vertex tag or edge type│
             │        │                                                       └───────────────────────┘
             │        │                                                  ┌────────────────────────────┐
             │        └──────────────────────────────────────────────────┤File to parse, a path or URL│
             │                                                           └────────────────────────────┘
             │                                                         ┌──────────────────────────────┐
             └─────────────────────────────────────────────────────────┤With Header in Row:0, Optional│
                                                                       └──────────────────────────────┘

Some other examples:

# load CSV from a URL
%ng_load --source https://github.com/wey-gu/jupyter_nebulagraph/raw/main/examples/actor.csv --tag player --vid 0 --props 1:name,2:age --space demo_basketballplayer
# with rank column
%ng_load --source follow_with_rank.csv --edge follow --src 0 --dst 1 --props 2:degree --rank 3 --space basketballplayer
# without rank column
%ng_load --source follow.csv --edge follow --src 0 --dst 1 --props 2:degree --space basketballplayer

Usage

%ng_load --source <source> [--header] --space <space> [--tag <tag>] [--vid <vid>] [--edge <edge>] [--src <src>] [--dst <dst>] [--rank <rank>] [--props <props>] [-b <batch>] [--limit <limit>]

Arguments

Argument Requirement Description
--header Optional Indicates if the CSV file contains a header row. If this flag is set, the first row of the CSV will be treated as column headers.
-n, --space Required Specifies the name of the NebulaGraph space where the data will be loaded.
-s, --source Required The file path or URL to the CSV file. Supports both local paths and remote URLs.
-t, --tag Optional The tag name for vertices. Required if loading vertex data.
--vid Optional The column index for the vertex ID. Required if loading vertex data.
-e, --edge Optional The edge type name. Required if loading edge data.
--src Optional The column index for the source vertex ID when loading edges.
--dst Optional The column index for the destination vertex ID when loading edges.
--rank Optional The column index for the rank value of edges. Default is None.
--props Optional Comma-separated column indexes for mapping to properties. The format for mapping is column_index:property_name.
-b, --batch Optional Batch size for data loading. Default is 256.
--limit Optional The maximum number of rows to load. Default is -1(unlimited).