TDengine Python Connector
taospy is the official Python connector for TDengine. It provides a rich set of APIs that make it easy for Python applications to access TDengine. taospy wraps both the native interface and the REST interface of TDengine, which correspond to the taos and taosrest modules of the taospy package, respectively.
In addition to wrapping the native and REST interfaces, taospy also provides a set of programming interfaces that conform to the Python Database API Specification (PEP 249). This makes it easy to integrate taospy with many third-party tools, such as SQLAlchemy and pandas.
The direct connection to the server using the native interface provided by the client driver is referred to hereinafter as a "native connection"; the connection to the server using the REST interface provided by taosAdapter is referred to hereinafter as a "REST connection".
The source code for the Python connector is hosted on GitHub.
Supported Platforms
- The supported platforms for the native connection are the same as the ones supported by the TDengine client.
- REST connections are supported on all platforms that can run Python.
Version selection
We recommend using the latest version of taospy, regardless of the version of TDengine.
Supported features
- Native connections support all the core features of TDengine, including connection management, SQL execution, bind interface, subscriptions, and schemaless writing.
- REST connections support connection management and SQL execution. SQL execution lets you manage databases, tables, and supertables, write data, query data, create continuous queries, and so on.
Installation
Preparation
- Install Python. Python >= 3.6 is recommended. If Python is not available on your system, refer to the Python BeginnersGuide to install it.
- Install pip. In most cases, the Python installer comes with the pip utility. If not, please refer to pip documentation to install it.
If you use a native connection, you will also need to install the client driver. The client installation package includes the TDengine client dynamic link library (libtaos.so or taos.dll) and the TDengine CLI.
Install via pip
Uninstalling an older version
If you have installed an older version of the Python Connector, please uninstall it beforehand.
pip3 uninstall taos taospy
Earlier versions of the TDengine client software include the Python connector. If the Python connector was installed from the client package's installation directory, the corresponding Python package name is taos. That is why the uninstall command above includes taos; it does no harm if that package does not exist.
To install taospy
- Install from PyPI
- Install from GitHub
Install the latest version:
pip3 install taospy
You can also install a specific version:
pip3 install taospy==2.3.0
pip3 install git+https://github.com/taosdata/taos-connector-python.git
Installation verification
- native connection
- REST connection
For a native connection, you need to verify that both the client driver and the Python connector itself are installed correctly. Both have been installed properly if you can successfully import the taos module. In the Python interactive shell, type:
import taos
For REST connections, verify that the taosrest module can be imported. In the Python interactive shell, type:
import taosrest
If you have multiple versions of Python on your system, you may have several pip commands. Be sure to use the pip command that belongs to the Python installation you intend to use. Above, we used the pip3 command, which rules out the pip that corresponds to Python 2.x. However, if you have more than one version of Python 3.x on your system, you still need to check that the installation path is correct. The easiest way to verify this is to run pip3 install taospy again on the command line; it prints the exact location of taospy. For example, on Windows:
C:\> pip3 install taospy
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: taospy in c:\users\username\appdata\local\programs\python\python310\lib\site-packages (2.3.0)
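To check which interpreter actually sees the connector, a small script run with that interpreter can help. The sketch below is illustrative and not part of taospy: the helper name try_import is hypothetical and it relies only on the standard library. It attempts to import a module and reports where it was loaded from, or why the import failed.

```python
import importlib
import sys


def try_import(module_name):
    """Return (True, location) if the module imports, else (False, reason)."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # e.g. a missing package or a client driver that fails to load
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__file__", "<built-in>")


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name in ("taos", "taosrest"):
        ok, detail = try_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

Running pip as `python3 -m pip install taospy` with the same interpreter is another way to make sure the package lands in the installation you intend to use.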
Establish connection
Connectivity testing
Before establishing a connection with the connector, we recommend testing the connectivity of the local TDengine CLI to the TDengine cluster.
- native connection
- REST connection
Ensure that the TDengine instance is up and that the FQDN of the machines in the cluster (the FQDN defaults to hostname if you are starting a standalone version) can be resolved locally, by testing with the ping command:
ping <FQDN>
Then test whether the cluster can be reached with the TDengine CLI (note the uppercase -P for the port; lowercase -p is the password option):
taos -h <FQDN> -P <PORT>
The FQDN above can be the FQDN of any dnode in the cluster, and the PORT is the serverPort corresponding to this dnode.
For REST connections, make sure the cluster and the taosAdapter component are running. This can be tested using the following curl command:
curl -u root:taosdata http://<FQDN>:<PORT>/rest/sql -d "select server_version()"
The FQDN above is the FQDN of the machine running taosAdapter, and PORT is the port taosAdapter listens on (6041 by default).
If the test is successful, it will output the server version information, e.g.
{
"status": "succ",
"head": ["server_version()"],
"column_meta": [["server_version()", 8, 8]],
"data": [["2.4.0.16"]],
"rows": 1
}
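The curl check above can also be issued from Python without installing the connector. This is a minimal sketch using only the standard library; the function names build_sql_request and server_version are hypothetical, and the URL path, credentials, and default port are the ones shown in the curl command.

```python
import base64
import json
import urllib.request


def build_sql_request(fqdn, port, sql, user="root", password="taosdata"):
    """Build the same POST request that the curl command above sends."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"http://{fqdn}:{port}/rest/sql",
        data=sql.encode(),                            # SQL text is the request body
        headers={"Authorization": f"Basic {token}"},  # HTTP basic auth, like curl -u
        method="POST",
    )


def server_version(fqdn="localhost", port=6041, timeout=5):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_sql_request(fqdn, port, "select server_version()")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode())
```

Calling server_version() against a running taosAdapter should return a dict shaped like the JSON above; a connection error at this point usually means taosAdapter is not reachable at that host and port.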
Using connectors to establish connections
The following example code assumes that TDengine is installed locally and that the default configuration is used for both FQDN and serverPort.
- native connection
- REST connection
import taos
conn: taos.TaosConnection = taos.connect(host="localhost",
                                         user="root",
                                         password="taosdata",
                                         database="test",
                                         port=6030,
                                         config="/etc/taos",  # on Windows the default value is C:\TDengine\cfg
                                         timezone="Asia/Shanghai")  # defaults to your host's timezone
server_version = conn.server_info
print("server_version", server_version)
client_version = conn.client_info
print("client_version", client_version)
conn.close()
# possible output:
# server_version 2.4.0.16
# client_version 2.4.0.16
All arguments of the connect() function are optional keyword arguments. The connection parameters are as follows:
- host: The FQDN of the node to connect to. There is no default value. If this parameter is not provided, the connector connects to the firstEP in the client configuration file.
- user: The TDengine user name. The default value is root.
- password: The TDengine user password. The default value is taosdata.
- port: The starting port of the data node to connect to, i.e., the serverPort configuration. The default value is 6030, which only takes effect if the host parameter is provided.
- config: The path to the client configuration file. The default is C:\TDengine\cfg on Windows and /etc/taos/ on Linux.
- timezone: The timezone used to convert TIMESTAMP data in query results to Python datetime objects. The default is the local timezone.
config and timezone are both process-level configurations. We recommend that all connections made by a process use the same parameter values. Otherwise, unpredictable errors may occur.
The connect() function returns a taos.TaosConnection instance. In client-side multi-threaded scenarios, we recommend that each thread request a separate connection instance rather than sharing a connection between multiple threads.
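One way to follow the one-connection-per-thread recommendation is to keep connections in thread-local storage. The sketch below is an assumption about how an application might organize this, not something taospy provides: the class name PerThreadConnections is hypothetical, and the connect argument is any zero-argument callable that returns a connection.

```python
import threading


class PerThreadConnections:
    """Hand each thread its own connection, created lazily on first use."""

    def __init__(self, connect):
        self._connect = connect          # zero-argument callable that opens a connection
        self._local = threading.local()  # attribute storage private to each thread

    def get(self):
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = self._connect()
            self._local.conn = conn
        return conn
```

With taospy installed, the factory could be something like `lambda: taos.connect(host="localhost")`, and each worker thread would call get() instead of sharing one global connection. Closing the connections when a thread finishes remains the application's responsibility.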
from taosrest import connect, TaosRestConnection, TaosRestCursor
conn: TaosRestConnection = connect(url="http://localhost:6041",
                                   user="root",
                                   password="taosdata",
                                   timeout=30)
All arguments to the connect() function are optional keyword arguments. The connection parameters are as follows:
- url: The URL of the taosAdapter REST service. The default is http://localhost:6041.
- user: The TDengine user name. The default is root.
- password: The TDengine user password. The default is taosdata.
- timeout: The HTTP request timeout in seconds. The default is socket._GLOBAL_DEFAULT_TIMEOUT. Usually, no configuration is needed.
Sample program
Basic Usage
- native connection
- REST connection
TaosConnection class
The TaosConnection class contains both an implementation of the PEP249 Connection interface (e.g., the cursor() method and the close() method) and many extensions (e.g., the execute(), query(), schemaless_insert(), and subscribe() methods).
import taos

conn = taos.connect()
# Execute a SQL statement, ignore the result set, and get only the number of affected rows. Useful for DDL and DML statements.
conn.execute("DROP DATABASE IF EXISTS test")
conn.execute("CREATE DATABASE test")
# change database. same as execute "USE db"
conn.select_db("test")
conn.execute("CREATE STABLE weather(ts TIMESTAMP, temperature FLOAT) TAGS (location INT)")
affected_row: int = conn.execute("INSERT INTO t1 USING weather TAGS(1) VALUES (now, 23.5) (now+1m, 23.5) (now+2m, 24.4)")
print("affected_row", affected_row)
# output:
# affected_row 3
# Execute a sql and get its result set. It's useful for SELECT statement
result: taos.TaosResult = conn.query("SELECT * from weather")
# Get fields from result
fields: taos.field.TaosFields = result.fields
for field in fields:
    print(field)  # {name: ts, type: 9, bytes: 8}
# output:
# {name: ts, type: 9, bytes: 8}
# {name: temperature, type: 6, bytes: 4}
# {name: location, type: 4, bytes: 4}
# Get data from result as list of tuple
data = result.fetch_all()
print(data)
# output:
# [(datetime.datetime(2022, 4, 27, 9, 4, 25, 367000), 23.5, 1), (datetime.datetime(2022, 4, 27, 9, 5, 25, 367000), 23.5, 1), (datetime.datetime(2022, 4, 27, 9, 6, 25, 367000), 24.399999618530273, 1)]
# Or get data from result as a list of dict
# map_data = result.fetch_all_into_dict()
# print(map_data)
# output:
# [{'ts': datetime.datetime(2022, 4, 27, 9, 1, 15, 343000), 'temperature': 23.5, 'location': 1}, {'ts': datetime.datetime(2022, 4, 27, 9, 2, 15, 343000), 'temperature': 23.5, 'location': 1}, {'ts': datetime.datetime(2022, 4, 27, 9, 3, 15, 343000), 'temperature': 24.399999618530273, 'location': 1}]
The queried results can only be fetched once. For example, only one of fetch_all() and fetch_all_into_dict() can be used in the example above. Repeated fetches will result in an empty list.
Use of TaosResult class
In the above example of using the TaosConnection class, we have shown two ways to get the result of a query: fetch_all() and fetch_all_into_dict(). In addition, TaosResult also provides methods to iterate through the result set by rows (rows_iter) or by data blocks (blocks_iter). Using these two methods is more efficient in scenarios where the query returns a large amount of data.
import taos
conn = taos.connect()
conn.execute("DROP DATABASE IF EXISTS test")
conn.execute("CREATE DATABASE test")
conn.select_db("test")
conn.execute("CREATE STABLE weather(ts TIMESTAMP, temperature FLOAT) TAGS (location INT)")
# prepare data
for i in range(2000):
    location = str(i % 10)
    tb = "t" + location
    conn.execute(f"INSERT INTO {tb} USING weather TAGS({location}) VALUES (now+{i}a, 23.5) (now+{i + 1}a, 23.5)")
result: taos.TaosResult = conn.query("SELECT * FROM weather")
block_index = 0
blocks: taos.TaosBlocks = result.blocks_iter()
for rows, length in blocks:
    print("block ", block_index, " length", length)
    print("first row in this block:", rows[0])
    block_index += 1
conn.close()
# possible output:
# block 0 length 1200
# first row in this block: (datetime.datetime(2022, 4, 27, 15, 14, 52, 46000), 23.5, 0)
# block 1 length 1200
# first row in this block: (datetime.datetime(2022, 4, 27, 15, 14, 52, 76000), 23.5, 3)
# block 2 length 1200
# first row in this block: (datetime.datetime(2022, 4, 27, 15, 14, 52, 99000), 23.5, 6)
# block 3 length 400
# first row in this block: (datetime.datetime(2022, 4, 27, 15, 14, 52, 122000), 23.5, 9)
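The benefit of iterating by block is that an aggregate can be computed while only one block is held in memory. The sketch below illustrates this with stand-in data so that it runs without a server; the function name average_temperature is hypothetical, and it assumes the iterable yields (rows, length) pairs in the same shape as the loop above.

```python
def average_temperature(blocks, column=1):
    """Average one column while holding only a single block in memory."""
    total = 0.0
    count = 0
    for rows, length in blocks:  # same (rows, length) shape as the loop above
        for row in rows:
            total += row[column]
        count += length
    return total / count if count else None


# Stand-in data: two blocks of row tuples instead of a live result set.
fake_blocks = [
    ([("t0", 23.5, 0), ("t1", 24.5, 3)], 2),
    ([("t2", 21.0, 6)], 1),
]
print(average_temperature(fake_blocks))  # 23.0
```

In a real program the argument would be the object returned by result.blocks_iter() rather than a list.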
Use of the TaosCursor class
The TaosConnection class and the TaosResult class already implement all the functionality of the native interface. If you are familiar with the interfaces in the PEP249 specification, you can also use the methods provided by the TaosCursor class.
import taos
conn = taos.connect()
cursor = conn.cursor()
cursor.execute("DROP DATABASE IF EXISTS test")
cursor.execute("CREATE DATABASE test")
cursor.execute("USE test")
cursor.execute("CREATE STABLE weather(ts TIMESTAMP, temperature FLOAT) TAGS (location INT)")
for i in range(1000):
    location = str(i % 10)
    tb = "t" + location
    cursor.execute(f"INSERT INTO {tb} USING weather TAGS({location}) VALUES (now+{i}a, 23.5) (now+{i + 1}a, 23.5)")
cursor.execute("SELECT count(*) FROM weather")
data = cursor.fetchall()
print("count:", data[0][0])
cursor.execute("SELECT tbname, * FROM weather LIMIT 2")
col_names = [meta[0] for meta in cursor.description]
print(col_names)
rows = cursor.fetchall()
print(rows)
cursor.close()
conn.close()
# output:
# count: 2000
# ['tbname', 'ts', 'temperature', 'location']
# [('t0', datetime.datetime(2022, 4, 27, 14, 54, 24, 392000), 23.5, 0), ('t0', datetime.datetime(2022, 4, 27, 14, 54, 24, 393000), 23.5, 0)]
The TaosCursor class uses native connections for write and query operations. In a client-side multi-threaded scenario, a cursor instance must remain exclusive to one thread and cannot be shared across threads; otherwise, the returned results may be wrong.
Use of TaosRestCursor class
The TaosRestCursor class is an implementation of the PEP249 Cursor interface.
# create STable
cursor: TaosRestCursor = conn.cursor()
cursor.execute("DROP DATABASE IF EXISTS power")
cursor.execute("CREATE DATABASE power")
cursor.execute("CREATE STABLE power.meters (ts TIMESTAMP, current FLOAT, voltage INT, phase FLOAT) TAGS (location BINARY(64), groupId INT)")
# insert data
cursor.execute("""INSERT INTO power.d1001 USING power.meters TAGS(California.SanFrancisco, 2) VALUES ('2018-10-03 14:38:05.000', 10.30000, 219, 0.31000) ('2018-10-03 14:38:15.000', 12.60000, 218, 0.33000) ('2018-10-03 14:38:16.800', 12.30000, 221, 0.31000)
power.d1002 USING power.meters TAGS(California.SanFrancisco, 3) VALUES ('2018-10-03 14:38:16.650', 10.30000, 218, 0.25000)
power.d1003 USING power.meters TAGS(California.LosAngeles, 2) VALUES ('2018-10-03 14:38:05.500', 11.80000, 221, 0.28000) ('2018-10-03 14:38:16.600', 13.40000, 223, 0.29000)
power.d1004 USING power.meters TAGS(California.LosAngeles, 3) VALUES ('2018-10-03 14:38:05.000', 10.80000, 223, 0.29000) ('2018-10-03 14:38:06.500', 11.50000, 221, 0.35000)""")
print("inserted row count:", cursor.rowcount)
# query data
cursor.execute("SELECT * FROM power.meters LIMIT 3")
# get total rows
print("queried row count:", cursor.rowcount)
# get column names from cursor
column_names = [meta[0] for meta in cursor.description]
# get rows
data: list[tuple] = cursor.fetchall()
print(column_names)
for row in data:
    print(row)
# output:
# inserted row count: 8
# queried row count: 3
# ['ts', 'current', 'voltage', 'phase', 'location', 'groupid']
# [datetime.datetime(2018, 10, 3, 14, 38, 5, 500000, tzinfo=datetime.timezone(datetime.timedelta(seconds=28800), '+08:00')), 11.8, 221, 0.28, 'california.losangeles', 2]
# [datetime.datetime(2018, 10, 3, 14, 38, 16, 600000, tzinfo=datetime.timezone(datetime.timedelta(seconds=28800), '+08:00')), 13.4, 223, 0.29, 'california.losangeles', 2]
# [datetime.datetime(2018, 10, 3, 14, 38, 5, tzinfo=datetime.timezone(datetime.timedelta(seconds=28800), '+08:00')), 10.8, 223, 0.29, 'california.losangeles', 3]
- cursor.execute: Executes arbitrary SQL statements.
- cursor.rowcount: For write operations, returns the number of rows successfully written. For query operations, returns the number of rows in the result set.
- cursor.description: Returns the description of the fields. Please refer to TaosRestCursor for the specific format of the description information.
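Because the first item of each entry in cursor.description is the column name under PEP 249, rows can be turned into dictionaries with a few lines of code. The sketch below uses the standard library's sqlite3 module as a stand-in PEP 249 driver so that it runs without a TDengine server; the helper name fetch_dicts and the sample table are hypothetical, and the same helper is expected to work with other compliant cursors.

```python
import sqlite3


def fetch_dicts(cursor):
    """Turn the current result set of a PEP 249 cursor into a list of dicts."""
    names = [meta[0] for meta in cursor.description]  # first item of each entry is the column name
    return [dict(zip(names, row)) for row in cursor.fetchall()]


demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("CREATE TABLE meters (ts TEXT, voltage INTEGER, phase REAL)")
demo_cursor.execute("INSERT INTO meters VALUES ('2018-10-03 14:38:05.000', 219, 0.31)")
demo_cursor.execute("SELECT ts, voltage, phase FROM meters")
print(fetch_dicts(demo_cursor))
demo_conn.close()
```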
Use of the RestClient class
The RestClient class is a direct wrapper for the REST API. It contains only a sql() method for executing arbitrary SQL statements and returning the result.
from taosrest import RestClient
client = RestClient("http://localhost:6041", user="root", password="taosdata")
res: dict = client.sql("SELECT ts, current FROM power.meters LIMIT 1")
print(res)
# output:
# {'status': 'succ', 'head': ['ts', 'current'], 'column_meta': [['ts', 9, 8], ['current', 6, 4]], 'data': [[datetime.datetime(2018, 10, 3, 14, 38, 5, tzinfo=datetime.timezone(datetime.timedelta(seconds=28800), '+08:00')), 10.3]], 'rows': 1}
For a more detailed description of the sql() method, please refer to RestClient.
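The dictionary returned by sql() keeps column names and rows in separate keys. A small helper can pair them up; this is a sketch that assumes the response has the head, data, and status keys shown in the sample output above (other server versions may use a different layout), and the function name response_to_records is hypothetical.

```python
def response_to_records(res):
    """Pair each row in res["data"] with the column names in res["head"]."""
    if res.get("status") != "succ":
        raise RuntimeError(f"query failed: {res}")
    return [dict(zip(res["head"], row)) for row in res["data"]]


# Stand-in response with the same keys as the sample output above.
sample = {
    "status": "succ",
    "head": ["ts", "current"],
    "column_meta": [["ts", 9, 8], ["current", 6, 4]],
    "data": [["2018-10-03 14:38:05.000", 10.3]],
    "rows": 1,
}
print(response_to_records(sample))
# [{'ts': '2018-10-03 14:38:05.000', 'current': 10.3}]
```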
Used with pandas
- native connection
- REST connection
- Native + SQLAlchemy
- REST + SQLAlchemy
import taos
import pandas
conn = taos.connect()
df: pandas.DataFrame = pandas.read_sql("SELECT * FROM power.meters", conn)
# print index
print(df.index)
# print data type of element in ts column
print(type(df.ts[0]))
print(df.head(3))
# output:
# RangeIndex(start=0, stop=8, step=1)
# <class 'pandas._libs.tslibs.timestamps.Timestamp'>
# ts current ... location groupid
# 0 2018-10-03 14:38:05.500 11.8 ... california.losangeles 2
# 1 2018-10-03 14:38:16.600 13.4 ... california.losangeles 2
# 2 2018-10-03 14:38:05.000 10.8 ... california.losangeles 3
import taosrest
import pandas
conn = taosrest.connect()
df: pandas.DataFrame = pandas.read_sql("SELECT * FROM power.meters", conn)
# print index
print(df.index)
# print data type of element in ts column
print(type(df.ts[0]))
print(df.head(3))
# output:
# RangeIndex(start=0, stop=8, step=1)
# <class 'pandas._libs.tslibs.timestamps.Timestamp'>
# ts current ... location groupid
# 0 2018-10-03 06:38:05.500000+00:00 11.8 ... california.losangeles 2
# 1 2018-10-03 06:38:16.600000+00:00 13.4 ... california.losangeles 2
# 2 2018-10-03 06:38:05+00:00 10.8 ... california.losangeles 3
import pandas
from sqlalchemy import create_engine
engine = create_engine("taos://root:taosdata@localhost:6030/power")
df: pandas.DataFrame = pandas.read_sql("SELECT * FROM power.meters", engine)
# print index
print(df.index)
# print data type of element in ts column
print(type(df.ts[0]))
print(df.head(3))
# output:
# RangeIndex(start=0, stop=8, step=1)
# <class 'pandas._libs.tslibs.timestamps.Timestamp'>
# ts current ... location groupid
# 0 2018-10-03 14:38:05.500 11.8 ... california.losangeles 2
# 1 2018-10-03 14:38:16.600 13.4 ... california.losangeles 2
# 2 2018-10-03 14:38:05.000 10.8 ... california.losangeles 3
import pandas
from sqlalchemy import create_engine
engine = create_engine("taosrest://root:taosdata@localhost:6041")
df: pandas.DataFrame = pandas.read_sql("SELECT * FROM power.meters", engine)
# print index
print(df.index)
# print data type of element in ts column
print(type(df.ts[0]))
print(df.head(3))
# output:
# RangeIndex(start=0, stop=8, step=1)
# <class 'pandas._libs.tslibs.timestamps.Timestamp'>
# ts current ... location groupid
# 0 2018-10-03 06:38:05.500000+00:00 11.8 ... california.losangeles 2
# 1 2018-10-03 06:38:16.600000+00:00 13.4 ... california.losangeles 2
# 2 2018-10-03 06:38:05+00:00 10.8 ... california.losangeles 3
Other sample programs
| Example program links | Example program content |
| --- | --- |
| bind_multi.py | Parameter binding, bind multiple rows at once |
| bind_row.py | Parameter binding, bind one row at a time |
| insert_lines.py | InfluxDB line protocol writing |
| json_tag.py | Use JSON type tags |
| subscribe-async.py | Asynchronous subscription |
| subscribe-sync.py | Synchronous subscription |
Other notes
Exception handling
All errors from database operations are thrown directly as exceptions and the error message from the database is passed up the exception stack. The application is responsible for exception handling. For example:
import taos
try:
    conn = taos.connect()
    conn.execute("CREATE TABLE 123")  # wrong sql
except taos.Error as e:
    print(e)
    print("exception class: ", e.__class__.__name__)
    print("error number:", e.errno)
    print("error message:", e.msg)
except BaseException as other:
    print("exception occur")
    print(other)
# output:
# [0x0216]: syntax error near 'Incomplete SQL statement'
# exception class: ProgrammingError
# error number: -2147483114
# error message: syntax error near 'Incomplete SQL statement'
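In the sample output, the bracketed code 0x0216 and the signed error number -2147483114 appear to describe the same error: the hex code matches the low 16 bits of the number. Assuming that relationship holds in general (it is inferred from this one example, not from a documented guarantee), a hypothetical helper such as error_code below can convert one into the other for logging or for comparison with error code lists.

```python
def error_code(errno):
    """Format the low 16 bits of a signed errno as a 4-digit hex code."""
    return f"0x{errno & 0xFFFF:04x}"


print(error_code(-2147483114))  # 0x0216
```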
About nanoseconds
Because Python's support for nanoseconds is still incomplete (see the links below), the current implementation returns integers for nanosecond-precision timestamps instead of the datetime objects produced for ms and us precision. Application developers need to handle these values themselves; pandas' to_datetime() is recommended. The Python connector may change this interface in the future if Python gains full official support for nanoseconds.
- https://stackoverflow.com/questions/10611328/parsing-datetime-strings-containing-nanoseconds
- https://www.python.org/dev/peps/pep-0564/
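When pandas is not available, an integer nanosecond timestamp can be split by hand, because datetime itself only stores microseconds. The sketch below assumes the integer counts nanoseconds since the Unix epoch; the function name split_nanoseconds is hypothetical.

```python
from datetime import datetime, timezone


def split_nanoseconds(ns_since_epoch):
    """Return (UTC datetime truncated to whole seconds, leftover nanoseconds)."""
    seconds, nanos = divmod(ns_since_epoch, 1_000_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc), nanos


moment, nanos = split_nanoseconds(1_538_548_685_000_000_123)
print(moment.isoformat(), nanos)  # 2018-10-03T06:38:05+00:00 123
```

Keeping the leftover nanoseconds as a separate integer avoids the silent precision loss that would occur if the value were first converted to a float number of seconds.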
Frequently Asked Questions
You are welcome to ask questions or report issues.
Important Update
Connector version | Important Update | Release date |
---|---|---|
2.3.1 | 1. support TDengine REST API 2. remove support for Python version below 3.6 | 2022-04-28 |
2.2.5 | support timezone option when connect | 2022-04-13 |
2.2.2 | support sqlalchemy dialect plugin | 2022-03-28 |
Release Notes: https://github.com/taosdata/taos-connector-python/releases