Get Started
This guide walks you through the end-to-end process of working with NebulaGraph from within a Jupyter Notebook.
Prerequisites¶
We need a running NebulaGraph cluster. If you don't have one, you can use NebulaGraph-Lite to spawn an ad-hoc cluster; for more options, please refer to the NebulaGraph Docs.
See also NebulaGraph Installation Options in the jupyter-nebulagraph documentation.
%pip install nebulagraph-lite
from nebulagraph_lite import nebulagraph_let as ng_let
n = ng_let()
# This takes around 5 mins
n.start()
Streaming output truncated to the last 5000 lines. ...var/lib/rpm/__db.002 var/lib/rpm/__db.003 2ccae830-d547-3d25-8d86-c8e24b20d62e Debug: using curl executable Debug: Localrepo homedir is /home/user/.udocker Debug: using curl executable Debug: already installed, installation skipped SHOW TAGS: ResultSet(None) Info: downloading layer sha256:73dde089847b2e7be0b3e12a438aa50dab9d29587a3f37512daf74271d6f7eb1 Info: downloading layer sha256:8a5d5aed99ca3dd1343afcb47ea7136fa6350a36a31c12ebc62a6a92ddf1c7ef Info: downloading layer sha256:19bc7f3f0d802b7e8dc89786cfa0e18ab81bc501d90e1d24720d470e4e213c03 Info: downloading layer sha256:7264a8db6415046d36d16ba98b79778e18accee6ffa71850405994cffa9be7de Info: loading basketballplayer dataset... _ _ _ _ ____ _ | \ | | ___| |__ _ _| | __ _ / ___|_ __ __ _ _ __ | |__ | \| |/ _ | '_ \| | | | |/ _` | | _| '__/ _` | '_ \| '_ \ | |\ | __| |_) | |_| | | (_| | |_| | | | (_| | |_) | | | | |_| \_|\___|_.__/ \__,_|_|\__,_|\____|_| \__,_| .__/|_| |_| |_| lite version [ OK ] nebulagraph_lite started successfully! CONTAINER ID P M NAMES IMAGE cac33b42-6851-316a-b32c-96641cb38492 . W ['nebula-metad'] vesoft/nebula-metad:v3 2ccae830-d547-3d25-8d86-c8e24b20d62e . W ['nebula-storaged'] vesoft/nebula-storaged:v3 2887f6d9-a872-3e17-b5cf-8ef4a7eb6e37 . W ['nebula-graphd'] vesoft/nebula-graphd:v3
Installation¶
First, install with pip:
%pip install jupyter_nebulagraph
Second, load the extension:
%load_ext ngql
%pip install jupyter_nebulagraph
%load_ext ngql
Connect to NebulaGraph¶
With:
%ngql --address <ip> --port <port> --user <username> --password <password>
By default, the spaces of the cluster are printed once the connection is established.
%ngql --address 127.0.0.1 --port 9669 --user root --password nebula
Connection Pool Created
| | Name |
|---|---|
| 0 | demo_basketballplayer |
| 1 | freebase_15k |
| 2 | nba |
| 3 | news |
%ngql USE demo_basketballplayer;
%ngql MATCH (v:player{name:"Tim Duncan"})-->(v2:player) RETURN v2.player.name AS Name;
| | Name |
|---|---|
| 0 | Tony Parker |
| 1 | Manu Ginobili |
Multiline Query %%ngql¶
Use the %%ngql cell magic to perform multiple queries in one go:
%%ngql
<line 0>;
<line 1>;
%%ngql
USE demo_basketballplayer;
SUBMIT JOB STATS;
SHOW STATS;
| | Type | Name | Count |
|---|---|---|---|
| 0 | Tag | player | 54 |
| 1 | Tag | team | 30 |
| 2 | Edge | follow | 82 |
| 3 | Edge | serve | 146 |
| 4 | Space | vertices | 84 |
| 5 | Space | edges | 228 |
Cheatsheet¶
The only takeaway should be: you can always run %ngql help for details of the supported magics and for examples you can copy.
%ngql help
Using Variables in Query String¶
We use Jinja2 (https://jinja.palletsprojects.com/) as the templating method for variables in the query string:
trainer = "Sue"
%%ngql
GO FROM "{{ trainer }}" OVER owns_pokemon YIELD owns_pokemon._dst as pokemon_id | GO FROM $-.pokemon_id OVER owns_pokemon REVERSELY YIELD owns_pokemon._dst AS Trainer_Name;
vid = "player100"
%%ngql
MATCH (v)<-[e:follow]- (v2)-[e2:serve]->(v3)
WHERE id(v) == "{{ vid }}"
RETURN v2.player.name AS FriendOf, v3.team.name AS Team LIMIT 3;
| | FriendOf | Team |
|---|---|---|
| 0 | Boris Diaw | Spurs |
| 1 | Boris Diaw | Jazz |
| 2 | Boris Diaw | Suns |
df = _
df
| | FriendOf | Team |
|---|---|---|
| 0 | Boris Diaw | Spurs |
| 1 | Boris Diaw | Jazz |
| 2 | Boris Diaw | Suns |
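To make the substitution step concrete, here is a minimal sketch of what happens to a `{{ vid }}` placeholder before the query is sent. The extension itself uses Jinja2; the `render_query` helper below is a hypothetical stdlib-only stand-in that mimics only plain variable substitution (no filters, loops, or other Jinja2 features), so it can run without extra dependencies.

```python
import re


def render_query(template, variables):
    """Replace each ``{{ name }}`` placeholder with the matching variable.

    Hypothetical helper: a simplified stand-in for Jinja2's plain variable
    substitution, not the code the extension actually runs.
    """
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined variable in query: {name}")
        return str(variables[name])

    # Match "{{ name }}" with optional whitespace inside the braces.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


vid = "player100"
query = render_query('MATCH (v) WHERE id(v) == "{{ vid }}" RETURN v;', {"vid": vid})
print(query)
```

Note that the quotes around the placeholder belong to the query text, not to the variable, which is why the examples above write `"{{ vid }}"` for string vertex IDs.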
Tweaking Raw Result (Optional)¶
By default, ngql_result_style is pandas, which lets Jupyter Notebook render results as a table.
If you would like to get the raw results from nebula3-python itself, configure it on the fly as below:
%config IPythonNGQL.ngql_result_style="raw"
After querying, the result is stored in _. Assign it to a new variable for further ad-hoc tweaking, like:
%ngql <query>;
result = _
dir(result)
%config IPythonNGQL.ngql_result_style="raw"
%%ngql
USE demo_basketballplayer;
GO 2 STEPS FROM "player102" OVER follow YIELD dst(edge);
ResultSet(keys: ['dst(EDGE)'], values: ["player100"],["player102"],["player125"],["player101"],["player125"])
r = _
r.column_values("dst(EDGE)")[0].cast()
'player100'
Now we change ngql_result_style back to pandas:
%config IPythonNGQL.ngql_result_style="pandas"
%%ngql
GO FROM "player100", "player102" OVER serve
WHERE properties(edge).start_year > 1995
YIELD DISTINCT properties($$).name AS team_name, properties(edge).start_year AS start_year, properties($^).name AS player_name;
| | team_name | start_year | player_name |
|---|---|---|---|
| 0 | Spurs | 1997 | Tim Duncan |
| 1 | Trail Blazers | 2006 | LaMarcus Aldridge |
| 2 | Spurs | 2015 | LaMarcus Aldridge |
Load Data from CSV¶
Since version 0.9.0, data can be loaded into NebulaGraph directly from CSV files.
The source can be a local path or a URL:
%ng_load --source https://github.com/wey-gu/ipython-ngql/raw/main/examples/actor.csv --tag player --vid 0 --props 1:name,2:age --space demo_basketballplayer
Parsed 3 vertices 'demo_basketballplayer' for tag 'player' in memory
Loading Vertices: 0%| | 0/1 [00:00<?, ?it/s]
Loaded 3 of 3 vertices Successfully loaded 3 vertices 'demo_basketballplayer' for tag 'player'
%ng_load docs¶
The %ng_load magic command is designed to facilitate the loading of data from CSV files into NebulaGraph as vertices or edges. This command streamlines the process of importing data directly within a Jupyter Notebook environment, making it easier for users to work with NebulaGraph databases.
Usage¶
%ng_load --source <source> [--header] --space <space> [--tag <tag>] [--vid <vid>] [--edge <edge>] [--src <src>] [--dst <dst>] [--rank <rank>] [--props <props>] [-b <batch>]
Arguments¶
--header: (Optional) Indicates if the CSV file contains a header row. If this flag is set, the first row of the CSV will be treated as column headers.
-n, --space (Required): Specifies the name of the NebulaGraph space where the data will be loaded.
-s, --source (Required): The file path or URL to the CSV file. Supports both local paths and remote URLs.
-t, --tag: The tag name for vertices. Required if loading vertex data.
--vid: The column index for the vertex ID. Required if loading vertex data.
-e, --edge: The edge type name. Required if loading edge data.
--src: The column index for the source vertex ID when loading edges.
--dst: The column index for the destination vertex ID when loading edges.
--rank: (Optional) The column index for the rank value of edges. Default is None.
--props: (Optional) Comma-separated column indexes for mapping to properties. The format for mapping is column_index:property_name.
-b, --batch (Optional): Batch size for data loading. Default is 256.
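As an illustration of what a batch size means, the sketch below groups rows into chunks of at most 256. It assumes the loader sends rows in consecutive groups of at most `-b` rows; this is an assumption about its behaviour rather than a description of its code, and the `batched` helper is hypothetical.

```python
from itertools import islice


def batched(rows, batch_size=256):
    """Yield consecutive lists holding at most batch_size rows each."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Sample rows in the same shape as the actor.csv example from `%ngql help`.
rows = [("player999", "Tom Hanks", 30), ("player1000", "Tom Cruise", 40)]

print([len(batch) for batch in batched(rows)])     # default batch size of 256
print([len(batch) for batch in batched(rows, 1)])  # one row per batch
```

Under this assumption, a small file fits in a single batch with the default size, while a smaller `-b` value splits the same rows into more, smaller requests.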
Examples¶
Loading Vertices
To load vertex data from a local CSV file named actor.csv into the demo_basketballplayer space with the player tag, where the vertex ID is in the first column, and the properties name and age are in the second and third columns, respectively:
%ng_load --source actor.csv --tag player --vid 0 --props 1:name,2:age --space demo_basketballplayer
Parsed 3 vertices 'demo_basketballplayer' for tag 'player' in memory
Loading Vertices: 0%| | 0/1 [00:00<?, ?it/s]
Loaded 3 of 3 vertices Successfully loaded 3 vertices 'demo_basketballplayer' for tag 'player'
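If you want to try this with a file of your own, the following sketch writes a small CSV in the same layout and shows how the zero-based column indexes in `--vid 0 --props 1:name,2:age` line up with it. The two rows are taken from the actor.csv sample printed by `%ngql help`; the `parse_props` helper and the temporary path are illustrative assumptions, not part of the extension.

```python
import csv
import os
import tempfile

# Rows copied from the actor.csv sample shown in `%ngql help`.
rows = [["player999", "Tom Hanks", 30], ["player1000", "Tom Cruise", 40]]

# Write to a temporary directory; pass this path to --source when loading.
path = os.path.join(tempfile.mkdtemp(), "actor.csv")
with open(path, "w", newline="") as f:
    # QUOTE_NONNUMERIC quotes strings and leaves numbers bare.
    csv.writer(f, quoting=csv.QUOTE_NONNUMERIC).writerows(rows)


def parse_props(spec):
    """Turn a spec such as '1:name,2:age' into {1: 'name', 2: 'age'}."""
    mapping = {}
    for item in spec.split(","):
        index, name = item.split(":", 1)
        mapping[int(index)] = name
    return mapping


props = parse_props("1:name,2:age")
vertices = []
with open(path, newline="") as f:
    for row in csv.reader(f):
        vid = row[0]  # corresponds to --vid 0
        values = {name: row[index] for index, name in props.items()}
        vertices.append((vid, values))

print(vertices)
```

The CSV reader returns every field as a string, so any type conversion (such as age to an integer) happens later, on the loading side.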
Loading Edges
To load edge data from a local CSV file named follow_with_rank.csv into the demo_basketballplayer space with the follow edge type, where the source vertex ID is in the first column, the destination vertex ID is in the second column, the property degree is in the third column, and the rank is in the fourth column:
%ng_load --source follow_with_rank.csv --edge follow --src 0 --dst 1 --props 2:degree --rank 3 --space demo_basketballplayer
Parsed 1 edges 'demo_basketballplayer' for edge type 'follow' in memory
Loading Edges: 0%| | 0/1 [00:00<?, ?it/s]
Loaded 1 of 1 edges Successfully loaded 1 edges 'demo_basketballplayer' for edge type 'follow'
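To see how the `--src`, `--dst`, `--props` and `--rank` column indexes relate to nGQL, the sketch below turns one CSV row into the equivalent `INSERT EDGE` statement. This is only an illustration of the mapping and not the extension's implementation; `edge_insert_statement` is a hypothetical helper, and it assumes all property values are numeric literals (string properties would need quoting).

```python
import json


def edge_insert_statement(row, edge, src, dst, props, rank=None):
    """Build one INSERT EDGE statement from a CSV row (illustration only).

    props maps column index -> property name, e.g. {2: "degree"}.
    Property values are emitted as-is, so they are assumed to be numeric.
    """
    names = ", ".join(props.values())
    values = ", ".join(row[index] for index in props)
    # The rank is optional and is written as @<rank> after the destination.
    rank_part = f"@{row[rank]}" if rank is not None else ""
    # json.dumps adds the double quotes around the string vertex IDs.
    return (
        f"INSERT EDGE {edge}({names}) VALUES "
        f"{json.dumps(row[src])}->{json.dumps(row[dst])}{rank_part}:({values});"
    )


# The single row of follow_with_rank.csv shown in `%ngql help`.
row = ["player999", "player1000", "50", "1"]
statement = edge_insert_statement(
    row, "follow", src=0, dst=1, props={2: "degree"}, rank=3
)
print(statement)
```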
Draw nGQL queries %ng_draw¶
We can render graphs with %ng_draw, thanks to the upstream project pyvis.
%pip install pyvis
%ngql match p=(:player)-[]->() return p LIMIT 5
| | p |
|---|---|
| 0 | ("player148" :player{age: 45, name: "Jason Kid... |
| 1 | ("player148" :player{age: 45, name: "Jason Kid... |
| 2 | ("player148" :player{age: 45, name: "Jason Kid... |
| 3 | ("player148" :player{age: 45, name: "Jason Kid... |
| 4 | ("player148" :player{age: 45, name: "Jason Kid... |
##uncomment to draw
# %ng_draw
%ngql GET SUBGRAPH 2 STEPS FROM "player101" YIELD VERTICES AS nodes, EDGES AS relationships;
| | nodes | relationships |
|---|---|---|
| 0 | [("player101" :player{})] | [("player101")-[:serve@0{}]->("team204"), ("pl... |
| 1 | [("player102" :player{}), ("player100" :player... | [("player102")-[:serve@0{}]->("team203"), ("pl... |
| 2 | [("player144" :player{}), ("player112" :player... | [("player144")-[:serve@0{}]->("team214"), ("pl... |
%ng_draw
<class 'pyvis.network.Network'> |N|=36 |E|=84
Draw Graph Schema %ng_draw_schema¶
We can also quickly draw the schema with %ng_draw_schema, which samples all types of edges to show us what the graph looks like.
This example uses a dataset/space called demo_supplychain. To get the datasets named demo_*, you can install NebulaGraph Studio and click Download to have them ingested into NebulaGraph in one minute.
%ngql CREATE SPACE demo_supplychain(partition_num=1, replica_factor=1, vid_type=fixed_string(128));
!sleep 10
%ngql USE demo_supplychain
%%ngql
CREATE TAG IF NOT EXISTS car_model(name string, number string, year int, type string, engine_type string, size string, seats int);
CREATE TAG IF NOT EXISTS feature(name string, number string, type string, state string);
CREATE TAG IF NOT EXISTS `part`(name string, number string, price double, `date` string);
CREATE TAG IF NOT EXISTS supplier(name string, address string, contact string, phone_number string);
CREATE EDGE IF NOT EXISTS with_feature(version string);
CREATE EDGE IF NOT EXISTS is_composed_of(version string);
CREATE EDGE IF NOT EXISTS is_supplied_by(version string);
!sleep 10
%%ngql
INSERT VERTEX `car_model`(`name`, `number`, `year`, `type`, `engine_type`, `size`, `seats`) VALUES "m_1":("Model A", "001", 2023, "Sedan", "Gasoline", "Compact", 4), "m_2":("Model B", "002", 2023, "Coupe", "Electric", "Compact", 2), "m_3":("Model C", "003", 2022, "SUV", "Hybrid", "Large", 7), "m_4":("Model D", "004", 2022, "Truck", "Diesel", "Extra Large", 5), "m_5":("Model E", "005", 2021, "Sedan", "Electric", "Medium", 5), "m_6":("Model F", "006", 2021, "Convertible", "Gasoline", "Compact", 2), "m_7":("Model G", "007", 2023, "Crossover", "Hybrid", "Medium", 5), "m_8":("Model H", "008", 2020, "Hatchback", "Electric", "Compact", 4), "m_9":("Model I", "009", 2022, "Sedan", "Gasoline", "Large", 5), "m_10":("Model J", "010", 2021, "SUV", "Hybrid", "Extra Large", 7);
INSERT VERTEX `supplier`(`name`, `address`, `contact`, `phone_number`) VALUES "s_31":("Supplier A", "123 Street", "John Doe", "1234567890"), "s_32":("Supplier B", "456 Avenue", "Emily Smith", "0987654321"), "s_33":("Supplier C", "789 Boulevard", "Robert Brown", "1112233445"), "s_34":("Supplier D", "101 Place", "Maria Johnson", "2223344556"), "s_35":("Supplier E", "202 Drive", "Michael Williams", "3334455667"), "s_36":("Supplier F", "303 Lane", "Susan Miller", "4445566778"), "s_37":("Supplier G", "404 Road", "Chris Lee", "5556677889"), "s_38":("Supplier H", "505 Street", "Jane Wilson", "6667788990"), "s_39":("Supplier I", "606 Way", "Brian Anderson", "7778899001"), "s_40":("Supplier J", "707 Avenue", "Linda Hall", "8889900112");
INSERT VERTEX `feature`(`name`, `number`, `type`, `state`) VALUES "f_11":("Sunroof", "F001", "Optional", "Available"), "f_12":("Bluetooth", "F002", "Standard", "Available"), "f_13":("Navigation", "F003", "Optional", "N/A"), "f_14":("Heated Seats", "F004", "Standard", "Available"), "f_15":("Backup Camera", "F005", "Optional", "Available"), "f_16":("Leather Seats", "F006", "Standard", "Available"), "f_17":("Adaptive Cruise", "F007", "Optional", "Available"), "f_18":("Blind Spot Monitor", "F008", "Standard", "Available"), "f_19":("Remote Start", "F009", "Optional", "N/A"), "f_20":("Apple CarPlay", "F010", "Standard", "Available");
INSERT VERTEX `part`(`name`, `number`, `price`, `date`) VALUES "p_21":("Brake Pad", "P001", 50, "2023-01-01"), "p_22":("Engine", "P002", 2000, "2023-05-03"), "p_23":("Tire", "P003", 100, "2022-08-14"), "p_24":("Transmission", "P004", 1500, "2022-02-20"), "p_25":("Radiator", "P005", 250, "2022-06-15"), "p_26":("Window Glass", "P006", 60, "2021-11-23"), "p_27":("Battery", "P007", 120, "2023-03-09"), "p_28":("Headlight", "P008", 90, "2023-07-30"), "p_29":("Alternator", "P009", 180, "2022-09-04"), "p_30":("Air Filter", "P010", 20, "2023-04-22");
INSERT EDGE `with_feature`(`version`) VALUES "m_1"->"f_12":("1.0"), "m_2"->"f_13":("1.0"), "m_3"->"f_14":("1.1"), "m_4"->"f_15":("1.2"), "m_5"->"f_11":("1.0"), "m_6"->"f_12":("1.0"), "m_7"->"f_13":("1.0"), "m_8"->"f_14":("1.0"), "m_9"->"f_15":("1.0"), "m_10"->"f_11":("1.0"), "m_2"->"f_12":("1.0"), "m_3"->"f_13":("1.1"), "m_4"->"f_14":("1.2"), "m_5"->"f_15":("1.0"), "m_6"->"f_11":("1.0"), "m_7"->"f_12":("1.0"), "m_8"->"f_13":("1.0"), "m_9"->"f_14":("1.0"), "m_10"->"f_15":("1.0"), "m_1"->"f_11":("1.0"), "m_2"->"f_12":("1.2"), "m_3"->"f_13":("1.1"), "m_4"->"f_12":("1.0"), "m_5"->"f_15":("1.3"), "m_6"->"f_11":("1.2"), "m_7"->"f_14":("1.0"), "m_8"->"f_13":("1.1"), "m_9"->"f_15":("1.2"), "m_10"->"f_12":("1.1"), "m_1"->"f_13":("1.3"), "m_2"->"f_14":("1.0"), "m_3"->"f_11":("1.1"), "m_4"->"f_14":("1.0"), "m_5"->"f_15":("1.2"), "m_6"->"f_13":("1.0"), "m_7"->"f_12":("1.1"), "m_8"->"f_15":("1.1"), "m_9"->"f_11":("1.2"), "m_10"->"f_14":("1.3"), "m_2"->"f_11":("1.0"), "m_3"->"f_12":("1.1"), "m_5"->"f_14":("1.0"), "m_6"->"f_15":("1.1"), "m_8"->"f_12":("1.2"), "m_9"->"f_13":("1.0"), "m_1"->"f_15":("1.2"), "m_7"->"f_13":("1.3"), "m_4"->"f_11":("1.0"), "m_10"->"f_15":("1.1");
INSERT EDGE `is_composed_of`(`version`) VALUES "f_11"->"p_21":("1.0"), "f_12"->"p_22":("1.0"), "f_13"->"p_23":("1.1"), "f_14"->"p_24":("1.2"), "f_15"->"p_21":("1.0"), "f_16"->"p_22":("1.0"), "f_17"->"p_23":("1.0"), "f_18"->"p_24":("1.0"), "f_19"->"p_25":("1.0"), "f_20"->"p_26":("1.0"), "f_11"->"p_27":("1.0"), "f_12"->"p_28":("1.0"), "f_13"->"p_29":("1.1"), "f_14"->"p_30":("1.2"), "f_15"->"p_21":("1.0"), "f_16"->"p_22":("1.0"), "f_17"->"p_23":("1.0"), "f_18"->"p_24":("1.0"), "f_19"->"p_25":("1.0"), "f_20"->"p_26":("1.0");
INSERT EDGE `is_supplied_by`(`version`) VALUES "p_21"->"s_31":("1.0"), "p_22"->"s_32":("1.0"), "p_23"->"s_33":("1.1"), "p_24"->"s_34":("1.2"), "p_25"->"s_35":("1.0"), "p_26"->"s_36":("1.0"), "p_27"->"s_37":("1.0"), "p_28"->"s_38":("1.0"), "p_29"->"s_39":("1.1"), "p_30"->"s_40":("1.2"), "p_21"->"s_31":("1.0"), "p_22"->"s_32":("1.0"), "p_23"->"s_33":("1.1"), "p_24"->"s_34":("1.2"), "p_25"->"s_35":("1.0"), "p_26"->"s_36":("1.0"), "p_27"->"s_37":("1.0"), "p_28"->"s_38":("1.0"), "p_29"->"s_39":("1.1"), "p_30"->"s_40":("1.2");
%ng_draw_schema
<class 'pyvis.network.Network'> |N|=4 |E|=3
%ngql help!¶
Again, all you have to remember is to use %ngql help to get all the hints :-)
%ngql help
Supported Configurations:
------------------------

> How to config ngql_result_style in "raw", "pandas"
%config IPythonNGQL.ngql_result_style="raw"
%config IPythonNGQL.ngql_result_style="pandas"

> How to config ngql_verbose in True, False
%config IPythonNGQL.ngql_verbose=True

> How to config max_connection_pool_size
%config IPythonNGQL.max_connection_pool_size=10

Quick Start:
-----------

> Connect to Neubla Graph
%ngql --address 127.0.0.1 --port 9669 --user user --password password

> Use Space
%ngql USE basketballplayer

> Query
%ngql SHOW TAGS;

> Multile Queries
%%ngql
SHOW TAGS;
SHOW HOSTS;

Reload ngql Magic
%reload_ext ngql

> Variables in query, we are using Jinja2 here
name = "nba"
%ngql USE "{{ name }}"

> Query and draw the graph
%ngql GET SUBGRAPH 2 STEPS FROM "player101" YIELD VERTICES AS nodes, EDGES AS relationships;
%ng_draw

> Query and draw the graph schema
%ng_draw_schema

> Load data from CSV file into NebulaGraph as vertices or edges
%ng_load --source actor.csv --tag player --vid 0 --props 1:name,2:age --space basketballplayer

#actor.csv
"player999","Tom Hanks",30
"player1000","Tom Cruise",40

%ng_load --source follow_with_rank.csv --edge follow --src 0 --dst 1 --props 2:degree --rank 3 --space basketballplayer

#follow_with_rank.csv
"player999","player1000",50,1

%ng_load --source follow.csv --edge follow --src 0 --dst 1 --props 2:degree --space basketballplayer

#follow.csv
"player999","player1000",50

%ng_load --source https://github.com/wey-gu/ipython-ngql/raw/main/examples/actor.csv --tag player --vid 0 --props 1:name,2:age --space demo_basketballplayer -b 2