Synapse SQL on-demand firstrow skipping more than just the 1st row

Question

I have noticed that when you set firstrow = 2, the result set has missing rows.

This is easy to reproduce:

The query below (querying a public data source) returns 41165. Setting firstrow = 3 returns 41119 (my expectation is that it should have only 1 row fewer).

Interestingly, changing the query to select count(*) gives the expected behaviour (i.e. the row count decreases by 1 each time firstrow is incremented).

I noticed the issue while troubleshooting a sum function that returned less than I was expecting.

select COUNT(c1)
from openrowset(
    bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.csv',
    format = 'csv',
    parser_version = '2.0',
    firstrow = 2) as rows
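
The comparison described above can also be run in one query. This is only a minimal sketch against the same public file; the aliases total_rows and non_null_c1 are made up for illustration. COUNT(*) counts every row returned, while COUNT(c1) counts only the rows where c1 is not NULL, so a gap between the two numbers would suggest that c1 is being read as NULL on some rows rather than whole rows being dropped.

```sql
-- Sketch: compare the total row count with the non-NULL count of the first column.
-- total_rows and non_null_c1 are illustrative aliases, not names from the file.
select
    COUNT(*)  as total_rows,   -- every row returned by openrowset
    COUNT(c1) as non_null_c1   -- only rows where c1 is not NULL
from openrowset(
    bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.csv',
    format = 'csv',
    parser_version = '2.0',
    firstrow = 3) as [r]
```

Running it with firstrow = 2 and then firstrow = 3 should show whether total_rows drops by exactly 1 while non_null_c1 drops by more.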

Answer 1

Thank you for raising this; we are aware of the issue. A fix will land soon.

In the meantime, you can use parser_version = '1.0'.

Try using this query:

select COUNT(date_rep)
    from openrowset(
        bulk 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.csv',
        format = 'csv',
        parser_version = '1.0',
        firstrow = 3
    ) WITH (
        [date_rep] datetime2,
        [day] smallint,
        [month] smallint,
        [year] smallint,
        [cases] smallint,
        [deaths] smallint,
        [countries_and_territories] VARCHAR (100)
    ) AS [r]

solution1
0 2020-09-30 16:30:22
