ng_load

Load Data from CSV or Parquet file

The ng_load magic loads data from a CSV or Parquet file into NebulaGraph.

Examples

For example, to load the CSV file actor.csv into the space basketballplayer as vertices with the tag player, taking the VID from column 0 and the properties from columns 1 and 2:

player_id,name,age
"player999","Tom Hanks",30
"player1000","Tom Cruise",40
"player1001","Jimmy X",33

Then the %ng_load line would be:

%ng_load --header --source actor.csv --tag player --vid 0 --props 1:name,2:age --space basketballplayer
         ────┬─── ────┬───────────── ─────┬────── ───┬─── ─────────┬────────── ────────────┬───────────
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │                   │          │             │                       │           
             │        │  ┌────────────────┘          │             │                 ┌────────────────┐
             │        │  │                           │             │                 │Graph Space Name│
             │        │  │            ┌──────────────┘             │                 └────────────────┘
             │        │  │            │    ┌──────────────────────────────────────────────────────────┐
             │        │  │            │    │Properties on <column_index>:<prop_name> if there are any.│
             │        │  │            │    └──────────────────────────────────────────────────────────┘
             │        │  │   ┌────────┴───────────────────────────────────────────────────────────────┐
             │        │  │   │                              For tag, there will be column index of VID│
             │        │  │   │ For edge, there will be src/dst VID index, or optionally the rank index│
             │        │  │   └────────────────────────────────────────────────────────────────────────┘
             │        │  │                                                    ┌───────────────────────┐
             │        │  └────────────────────────────────────────────────────┤vertex tag or edge type│
             │        │                                                       └───────────────────────┘
             │        │                                                  ┌────────────────────────────┐
             │        └──────────────────────────────────────────────────┤File to parse, a path or URL│
             │                                                           └────────────────────────────┘
             │                                                         ┌──────────────────────────────┐
             └─────────────────────────────────────────────────────────┤With Header in Row:0, Optional│
                                                                       └──────────────────────────────┘
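To try the example above, the sample file has to exist first. The following is a minimal sketch, using only the Python standard library, that writes actor.csv with the rows shown earlier and reads it back to confirm which column index holds which value. The helper names write_sample and read_rows are illustrative and not part of the extension.

```python
import csv
from pathlib import Path

# Sample rows copied from the example above; the file name matches --source.
SAMPLE = '''player_id,name,age
"player999","Tom Hanks",30
"player1000","Tom Cruise",40
"player1001","Jimmy X",33
'''


def write_sample(path="actor.csv"):
    """Write the sample CSV (hypothetical helper) and return its path."""
    target = Path(path)
    target.write_text(SAMPLE, encoding="utf-8")
    return target


def read_rows(path, header=True):
    """Return data rows as lists of strings, skipping row 0 when header is True."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    return rows[1:] if header else rows


rows = read_rows(write_sample())
for row in rows:
    # Column 0 is the VID; columns 1 and 2 hold the name and age properties.
    print(row[0], row[1], row[2])
```

Checking the indexes this way before running %ng_load helps catch an off-by-one in --vid or --props.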

Some other examples:

# load CSV from a URL
%ng_load --source https://github.com/wey-gu/jupyter_nebulagraph/raw/main/examples/actor.csv --tag player --vid 0 --props 1:name,2:age --space demo_basketballplayer
# with rank column
%ng_load --source follow_with_rank.csv --edge follow --src 0 --dst 1 --props 2:degree --rank 3 --space basketballplayer
# without rank column
%ng_load --source follow.csv --edge follow --src 0 --dst 1 --props 2:degree --space basketballplayer

Usage

%ng_load --source <source> [--header] --space <space> [--tag <tag>] [--vid <vid>] [--edge <edge>] [--src <src>] [--dst <dst>] [--rank <rank>] [--props <props>] [-b <batch>] [--limit <limit>]
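The usage line can also be assembled programmatically, for instance when many files are loaded in a loop. The sketch below is a hypothetical helper, build_ng_load_line, that only builds the text of the line and does not call NebulaGraph. It assumes that a load is either a vertex load (--tag with --vid) or an edge load (--edge with --src and --dst), and that option order does not matter; both are assumptions of this sketch rather than documented behavior.

```python
def build_ng_load_line(source, space, *, header=False, tag=None, vid=None,
                       edge=None, src=None, dst=None, rank=None, props=None,
                       batch=None, limit=None):
    """Assemble a %ng_load line as a string (hypothetical helper)."""
    is_vertex = tag is not None
    is_edge = edge is not None
    # Assumption: exactly one of tag or edge is given per load.
    if is_vertex == is_edge:
        raise ValueError("pass either tag (vertex data) or edge (edge data)")
    if is_vertex and vid is None:
        raise ValueError("--vid is required when loading vertex data")
    # Assumption: both endpoints are needed for an edge load.
    if is_edge and (src is None or dst is None):
        raise ValueError("--src and --dst are needed when loading edge data")

    parts = ["%ng_load", "--source", source]
    if header:
        parts.append("--header")
    parts += ["--space", space]
    if is_vertex:
        parts += ["--tag", tag, "--vid", str(vid)]
    else:
        parts += ["--edge", edge, "--src", str(src), "--dst", str(dst)]
        if rank is not None:
            parts += ["--rank", str(rank)]
    if props:
        # props maps column index to property name, e.g. {1: "name", 2: "age"}.
        pairs = ",".join(f"{index}:{name}" for index, name in props.items())
        parts += ["--props", pairs]
    if batch is not None:
        parts += ["-b", str(batch)]
    if limit is not None:
        parts += ["--limit", str(limit)]
    return " ".join(parts)


line = build_ng_load_line("actor.csv", "basketballplayer", header=True,
                          tag="player", vid=0, props={1: "name", 2: "age"})
print(line)
```

The resulting string can be pasted into a notebook cell as is.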

Arguments

Argument	Requirement	Description
`--header`	Optional	Indicates if the CSV file contains a header row. If this flag is set, the first row of the CSV will be treated as column headers.
`-n`, `--space`	Required	Specifies the name of the NebulaGraph space where the data will be loaded.
`-s`, `--source`	Required	The file path or URL of the CSV or Parquet file. Supports both local paths and remote URLs.
`-t`, `--tag`	Optional	The tag name for vertices. Required if loading vertex data.
`--vid`	Optional	The column index for the vertex ID. Required if loading vertex data.
`-e`, `--edge`	Optional	The edge type name. Required if loading edge data.
`--src`	Optional	The column index for the source vertex ID when loading edges.
`--dst`	Optional	The column index for the destination vertex ID when loading edges.
`--rank`	Optional	The column index for the rank value of edges. Default is None.
`--props`	Optional	Comma-separated mappings from column index to property name, each in the form column_index:property_name (for example, 1:name,2:age).
`-b`, `--batch`	Optional	Batch size for data loading. Default is 256.
`--limit`	Optional	The maximum number of rows to load. Default is -1 (unlimited).
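To make the --props format and the --batch and --limit semantics concrete, here is a small sketch in plain Python. It is not the extension's implementation: parse_props and iter_batches are hypothetical names, and the sketch only assumes the defaults listed in the table (batch size 256, limit -1 meaning unlimited).

```python
from itertools import islice


def parse_props(spec):
    """Parse a string like '1:name,2:age' into {1: 'name', 2: 'age'}."""
    mapping = {}
    for item in spec.split(","):
        index, _, name = item.strip().partition(":")
        if not index.isdigit() or not name:
            raise ValueError(f"expected <column_index>:<prop_name>, got {item!r}")
        mapping[int(index)] = name
    return mapping


def iter_batches(rows, batch=256, limit=-1):
    """Yield lists of at most `batch` rows; stop after `limit` rows (-1 = no limit)."""
    it = iter(rows)
    if limit >= 0:
        it = islice(it, limit)
    while True:
        chunk = list(islice(it, batch))
        if not chunk:
            return
        yield chunk


props = parse_props("1:name,2:age")
row = ["player999", "Tom Hanks", "30"]
# Pick the property values out of a row by column index.
print({name: row[index] for index, name in props.items()})

# 600 rows with the default batch size are split into three batches.
print([len(chunk) for chunk in iter_batches(range(600))])
```

Under these assumptions, a smaller -b value means more, smaller batches, while --limit caps the total number of rows regardless of batch size.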