Distinct Counts

This applies to: Visual Data Discovery

Distinct count functionality determines the number of unique values in a column or expression within a selected table by comparing all the records pulled from the data store by a data source configuration. When distinct counts are used, unique value results are returned when analyzing data. For example, distinct counts could return the number of:

Unique customers in a sales database
Unique UPC codes for a category of products
The number of trucks in a company's fleet

For example, given a single collection and string field with the following three values:

Apple
Orange
Apple

The distinct count returns 2, since there are only two distinct values (“Apple” and “Orange”), while an ordinary count returns 3 to reflect the total number of records. SQL-based connectors might produce a query that looks like this:

select count(distinct myField) from myCollection

Support for this feature by connector is shown in the following table.

Key:Y - Supported; N - Not Supported; N/A - not applicable

Connector	Supported?	Notes
Amazon Redshift	Y
Amazon S3	Y
Apache Drill	Y
Apache Phoenix	Y
Apache Phoenix Query Server (QS)	Y
Apache Solr	Y
BigQuery	Y	If you need to access a BigQuery partition, explicitly include an alias for the built in partition column in your select clause, such as `select *, _PARTITIONTIME as pt from projectId.datasetId.tableId`.
Cloudera Impala	Y	Cloudera Impala connectors can receive only a single distinct count field in a query.
Cloudera Search	Y
Couchbase	Y
Dremio	Y
Elasticsearch 7.0	Y
Elasticsearch 8.0	Y
File Upload	Y
HDFS	Y
Hive	Y
Jira	Y
MemSQL	Y
Microsoft SQL Server	Y
MongoDB	Y
MySQL	Y
Oracle	Y
PostgreSQL	Y
Python	Y
Real Time Sales	Y
Salesforce	Y
SAP Hana	Y
SAP IQ	Y
Spark SQL	Y
Snowflake	Y
Teradata	Y
TIBCO DV	Y
Trino	Y
File Upload (Upload API)	Y
Vertica	Y