Skip to main content

Query data

Introduction

SQL is used by TDengine as its query language. Application programs can send SQL statements to TDengine through REST API or connectors. TDengine's CLI taos can also be used to execute ad hoc SQL queries. Here is the list of major query functionalities supported by TDengine:

  • Query on single column or multiple columns
  • Filter on tags or data columns: >, <, =, <>, like
  • Grouping of results: Group By
  • Sorting of results: Order By
  • Limit the number of results: Limit/Offset
  • Arithmetic on columns of numeric types or aggregate results
  • Join query with timestamp alignment
  • Aggregate functions: count, max, min, avg, sum, twa, stddev, leastsquares, top, bottom, first, last, percentile, apercentile, last_row, spread, diff

For example, the SQL statement below can be executed in TDengine CLI taos to select records with voltage greater than 215 and limit the output to only 2 rows.

select * from d1001 where voltage > 215 order by ts desc limit 2;
taos> select * from d1001 where voltage > 215 order by ts desc limit 2;
ts | current | voltage | phase |
======================================================================================
2018-10-03 14:38:16.800 | 12.30000 | 221 | 0.31000 |
2018-10-03 14:38:15.000 | 12.60000 | 218 | 0.33000 |
Query OK, 2 row(s) in set (0.001100s)

To meet the requirements of varied use cases, some special functions have been added in TDengine. Some examples are twa (Time Weighted Average), spread (The difference between the maximum and the minimum), and last_row (the last row). Furthermore, continuous query is also supported in TDengine.

For detailed query syntax please refer to Select.

Aggregation among Tables

In most use cases, there are always multiple kinds of data collection points. A new concept, called STable (abbreviation for super table), is used in TDengine to represent one type of data collection point, and a subtable is used to represent a specific data collection point of that type. Tags are used by TDengine to represent the static properties of data collection points. A specific data collection point has its own values for static properties. By specifying filter conditions on tags, aggregation can be performed efficiently among all the subtables created via the same STable, i.e. same type of data collection points. Aggregate functions applicable for tables can be used directly on STables; the syntax is exactly the same.

In summary, records across subtables can be aggregated by a simple query on their STable. It is like a join operation. However, tables belonging to different STables can not be aggregated.

Example 1

In TDengine CLI taos, use the SQL below to get the average voltage of all the meters in California grouped by location.

taos> SELECT AVG(voltage) FROM meters GROUP BY location;
avg(voltage) | location |
=============================================================
222.000000000 | California.LosAngeles |
219.200000000 | California.SanFrancisco |
Query OK, 2 row(s) in set (0.002136s)

Example 2

In TDengine CLI taos, use the SQL below to get the number of rows and the maximum current in the past 24 hours from meters whose groupId is 2.

taos> SELECT count(*), max(current) FROM meters where groupId = 2 and ts > now - 24h;
count(*) | max(current) |
==================================
5 | 13.4 |
Query OK, 1 row(s) in set (0.002136s)

Join queries are only allowed between subtables of the same STable. In Select, all query operations are marked as to whether they support STables or not.

Down Sampling and Interpolation

In IoT use cases, down sampling is widely used to aggregate data by time range. The INTERVAL keyword in TDengine can be used to simplify the query by time window. For example, the SQL statement below can be used to get the sum of current every 10 seconds from meters table d1001.

taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
ts | sum(current) |
======================================================
2018-10-03 14:38:00.000 | 10.300000191 |
2018-10-03 14:38:10.000 | 24.900000572 |
Query OK, 2 row(s) in set (0.000883s)

Down sampling can also be used for STable. For example, the below SQL statement can be used to get the sum of current from all meters in California.

taos> SELECT SUM(current) FROM meters where location like "California%" INTERVAL(1s);
ts | sum(current) |
======================================================
2018-10-03 14:38:04.000 | 10.199999809 |
2018-10-03 14:38:05.000 | 32.900000572 |
2018-10-03 14:38:06.000 | 11.500000000 |
2018-10-03 14:38:15.000 | 12.600000381 |
2018-10-03 14:38:16.000 | 36.000000000 |
Query OK, 5 row(s) in set (0.001538s)

Down sampling also supports time offset. For example, the below SQL statement can be used to get the sum of current from all meters but each time window must start at the boundary of 500 milliseconds.

taos> SELECT SUM(current) FROM meters INTERVAL(1s, 500a);
ts | sum(current) |
======================================================
2018-10-03 14:38:04.500 | 11.189999809 |
2018-10-03 14:38:05.500 | 31.900000572 |
2018-10-03 14:38:06.500 | 11.600000000 |
2018-10-03 14:38:15.500 | 12.300000381 |
2018-10-03 14:38:16.500 | 35.000000000 |
Query OK, 5 row(s) in set (0.001521s)

In many use cases, it's hard to align the timestamp of the data collected by each collection point. However, a lot of algorithms like FFT require the data to be aligned with same time interval and application programs have to handle this by themselves. In TDengine, it's easy to achieve the alignment using down sampling.

Interpolation can be performed in TDengine if there is no data in a time range.

For more details please refer to Aggregate by Window.

Examples

Query

In the section describing Insert, a database named power is created and some data are inserted into STable meters. Below sample code demonstrates how to query the data in this STable.

package com.taos.example;

import java.sql.*;

public class RestQueryExample {
private static Connection getConnection() throws SQLException {
String jdbcUrl = "jdbc:TAOS-RS://localhost:6041/power?user=root&password=taosdata";
return DriverManager.getConnection(jdbcUrl);
}

private static void printRow(ResultSet rs) throws SQLException {
ResultSetMetaData meta = rs.getMetaData();
for (int i = 1; i <= meta.getColumnCount(); i++) {
String value = rs.getString(i);
System.out.print(value);
System.out.print("\t");
}
System.out.println();
}

private static void printColName(ResultSet rs) throws SQLException {
ResultSetMetaData meta = rs.getMetaData();
for (int i = 1; i <= meta.getColumnCount(); i++) {
String colLabel = meta.getColumnLabel(i);
System.out.print(colLabel);
System.out.print("\t");
}
System.out.println();
}

private static void processResult(ResultSet rs) throws SQLException {
printColName(rs);
while (rs.next()) {
printRow(rs);
}
}

private static void queryData() throws SQLException {
try (Connection conn = getConnection()) {
try (Statement stmt = conn.createStatement()) {
ResultSet rs = stmt.executeQuery("SELECT AVG(voltage) FROM meters GROUP BY location");
processResult(rs);
}
}
}

public static void main(String[] args) throws SQLException {
queryData();
}
}

// possible output:
// avg(voltage) location
// 222.0 California.LosAngeles
// 219.0 California.SanFrancisco

view source code

note
  1. With either REST connection or native connection, the above sample code works well.
  2. Please note that use db can't be used in case of REST connection because it's stateless.

Asynchronous Query

Besides synchronous queries, an asynchronous query API is also provided by TDengine to insert or query data more efficiently. With a similar hardware and software environment, the async API is 2~4 times faster than sync APIs. Async API works in non-blocking mode, which means an operation can be returned without finishing so that the calling thread can switch to other work to improve the performance of the whole application system. Async APIs perform especially better in the case of poor networks.

Please note that async query can only be used with a native connection.

import time
from ctypes import *

from taos import *


def fetch_callback(p_param, p_result, num_of_rows):
print("fetched ", num_of_rows, "rows")
p = cast(p_param, POINTER(Counter))
result = TaosResult(p_result)

if num_of_rows == 0:
print("fetching completed")
p.contents.done = True
result.close()
return
if num_of_rows < 0:
p.contents.done = True
result.check_error(num_of_rows)
result.close()
return None

for row in result.rows_iter(num_of_rows):
print(row)
p.contents.count += result.row_count
result.fetch_rows_a(fetch_callback, p_param)


def query_callback(p_param, p_result, code):
if p_result is None:
return
result = TaosResult(p_result)
if code == 0:
result.fetch_rows_a(fetch_callback, p_param)
result.check_error(code)


class Counter(Structure):
_fields_ = [("count", c_int), ("done", c_bool)]

def __str__(self):
return "{ count: %d, done: %s }" % (self.count, self.done)


def test_query(conn):
counter = Counter(count=0)
conn.query_a("select ts, current, voltage from power.meters", query_callback, byref(counter))

while not counter.done:
print(counter)
time.sleep(1)
print(counter)
conn.close()


if __name__ == "__main__":
test_query(connect())

# possible output:
# { count: 0, done: False }
# fetched 8 rows
# 1538548685000 10.300000 219
# 1538548695000 12.600000 218
# 1538548696800 12.300000 221
# 1538548696650 10.300000 218
# 1538548685500 11.800000 221
# 1538548696600 13.400000 223
# 1538548685500 10.800000 223
# 1538548686500 11.500000 221
# fetched 0 rows
# fetching completed
# { count: 8, done: True }

view source code

note

This sample code can't be run on Windows system for now.