Explore all Stream Processing open source software, libraries, packages, source code, cloud functions and APIs.

Popular New Releases in Stream Processing

webtorrent: v1.8.14
aria2: 1.36.0
webtorrent-desktop: v0.24.0
Jackett: v0.20.933
popcorn-desktop: v0.4.7

Popular Libraries in Stream Processing

gulp by gulpjs (JavaScript, MIT, 32240 stars)
A toolkit to automate & enhance your workflow

webtorrent by webtorrent (JavaScript, MIT, 26250 stars)
⚡️ Streaming torrent client for the web

aria2 by aria2 (C++, NOASSERTION, 26036 stars)
aria2 is a lightweight multi-protocol & multi-source, cross-platform download utility operated from the command line. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent, and Metalink.

ZeroNet by HelloZeroNet (JavaScript, NOASSERTION, 16995 stars)
ZeroNet - Decentralized websites using Bitcoin crypto and the BitTorrent network

qBittorrent by qbittorrent (C++, NOASSERTION, 14553 stars)
qBittorrent BitTorrent client

nginx-rtmp-module by arut (C, BSD-2-Clause, 11056 stars)
NGINX-based Media Streaming Server

webtorrent-desktop by webtorrent (JavaScript, MIT, 8773 stars)
❤️ Streaming torrent app for Mac, Windows, and Linux

Jackett by Jackett (C#, GPL-2.0, 7299 stars)
API support for your favorite torrent trackers

Sonarr by Sonarr (C#, NOASSERTION, 7123 stars)
Smart PVR for newsgroup and BitTorrent users.

Trending New Libraries in Stream Processing

exatorrent by varbhat (Go, GPL-3.0, 1428 stars)
Easy-to-use torrent client. Can be hosted in the cloud; files can be streamed in a browser or media player.

bobarr by iam4x (TypeScript, MIT, 955 stars)
🍿 The all-in-one alternative for Sonarr, Radarr, Jackett... with a VPN and running in Docker

engine by Monibuca (Go, MIT, 542 stars)
The Monibuca core engine, containing the core media-stream forwarding logic; it needs to be run together with the feature plugins.

timetrace by dominikbraun (Go, Apache-2.0, 487 stars)
timetrace is a simple CLI for tracking your working time.

Tidal-Media-Downloader-PRO by yaronzz (C#, Apache-2.0, 456 stars)
Download 'TIDAL' music on Windows/Linux/macOS (Python/C#)

RedTeamCCode by Mr-Un1k0d3r (C, 328 stars)
Red Team C code repo

Correios-Brasil by FinotiLucas (TypeScript, Apache-2.0, 320 stars)
A complete module for looking up CEP (Brazilian postal code) information, calculating parcel shipping prices and delivery times, and tracking multiple packages!

kafka-streams-dotnet by LGouellec (C#, MIT, 256 stars)
.NET Stream Processing Library for Apache Kafka 🚀

cratetorrent by mandreyel (Rust, 240 stars)
A BitTorrent V1 engine library for Rust (and currently Linux)

Top Authors in Stream Processing

1. mafintosh: 28 Libraries, 13067 stars
2. webtorrent: 24 Libraries, 43720 stars
3. bartbutenaers: 15 Libraries, 87 stars
4. node-red: 11 Libraries, 1659 stars
5. DeanCording: 11 Libraries, 118 stars
6. Raynos: 11 Libraries, 159 stars
7. hkjang: 10 Libraries, 24 stars
8. pull-stream: 10 Libraries, 99 stars
9. substack: 9 Libraries, 582 stars
10. flow-io: 9 Libraries, 19 stars


Trending Kits in Stream Processing

Developers widely use Python stream processing to query ongoing data streams and react to important events within timeframes ranging from milliseconds to minutes. Complex event processing, real-time analytics, and streaming analytics are all closely linked to stream processing, which is now the primary framework for executing these use cases.

 

Stream processing engines are runtime libraries that let developers process streaming data without having to deal with low-level streaming mechanics. Traditionally, data were processed in batches based on a schedule or a predefined trigger (for instance, every night at 1 am, every hundred rows, or every time the volume reached two megabytes). As data volumes and speeds have increased, however, batch processing is no longer sufficient for many applications, and Python stream processing has become a must-have capability for modern applications. For a wide range of use cases, enterprises have turned to technologies that respond to data as it is created: stream processing enables applications to react to new data events as they happen. Unlike batch processing, which groups data and collects it at predetermined intervals, stream processing applications collect and process data as soon as it is generated.
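
To make the contrast concrete, here is a small, framework-free Python sketch (illustrative only, using nothing beyond the standard language) of the same computation performed as a batch and as a stream:

def source():
    # Stand-in for an unbounded event source (sensor readings, payments, logs)
    for i in range(5):
        yield {"value": i}

# Batch: collect everything first, then process at a predefined point
batch = list(source())                       # blocks until the whole batch exists
print("batch total:", sum(e["value"] for e in batch))

# Stream: process each event the moment it arrives
running_total = 0
for event in source():
    running_total += event["value"]
    print("running total so far:", running_total)   # a result per event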

 

Python stream processing is most commonly used with data generated as a series of events, such as IoT sensor data, payment processing systems, servers, and application logs. The two common paradigms are publisher/subscriber (also known as pub/sub) and source/sink. A publisher or source generates data and events, which are then delivered to a stream processing application, where the data might be augmented, tested against fraud detection algorithms, or otherwise transformed before being sent to a subscriber or sink (sketched below). Furthermore, all major cloud providers, such as AWS, Azure, and Google Cloud, have native services that simplify stream processing development on their respective platforms.
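
The following is a minimal, in-memory Python sketch of that publisher/subscriber flow; Publisher and fraud_check are hypothetical names for this example, not part of any specific library:

from collections import defaultdict

class Publisher:
    """Toy event bus standing in for a pub/sub broker."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Every event is pushed to each subscribed sink as soon as it is
        # produced; no batching or scheduling is involved.
        for handler in self._subscribers[topic]:
            handler(event)

def fraud_check(event):
    # Example in-flight test applied to the stream, as described above
    if event["amount"] > 10_000:
        print("flagged:", event)

bus = Publisher()
bus.subscribe("payments", fraud_check)
bus.publish("payments", {"id": 1, "amount": 25_000})  # flagged immediately
bus.publish("payments", {"id": 2, "amount": 40})      # passes silently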


Check out the list below to find more popular Python stream-processing libraries for your applications: 

Trending Discussions on Stream Processing

Unhandled error Error: Data cannot be encoded in JSON error at firebase serverless functions

Flink missing windows generated on some partitions

Do I need a JAR file to run a Flink application?

Calling Hibernate in Spring cloud Stream

Filtering in Kafka and other streaming technologies

Pexpect Multi-Threading Idle State

Apache Flink - how to stop and resume stream processing on downstream failure

Hazelcast IMDG vs Hazelcast Jet config

What is benefit of using Kafka streams?

How to filter data in Kafka?

QUESTION

Unhandled error Error: Data cannot be encoded in JSON error at firebase serverless functions

Asked 2022-Apr-04 at 23:15

I'm trying to deploy an API for my application. This code raises "Unhandled error Error: Data cannot be encoded in JSON".

const functions = require("firebase-functions");
const axios = require("axios");
exports.getDatas = functions.https.onCall(async (d) => {
  functions.logger.log(d["name"]);
  cname = d["name"];
  ts1 = d["ts1"];
  ts2 = d["ts2"];
  const data = await axios.get(
    "https://api.coingecko.com/api/v3/coins/" +
      cname +
      "/market_chart/range?vs_currency=usd&from=" +
      ts1 +
      "&to=" +
      ts2,
  );
  functions.logger.log(data);
  return {data: data};
});

The error log is

Unhandled error Error: Data cannot be encoded in JSON: function httpAdapter(config) {
  return new Promise(function dispatchHttpRequest(resolvePromise, rejectPromise) {
    [the error message inlines the entire axios httpAdapter source here]
  });
}
    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:162:11)
    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
    at /workspace/node_modules/firebase-functions/lib/common/providers/https.js:334:22
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
367

The first logger logs the parameter I passed correctly, and the logger that logs data produces output in this format:

[the log dumps the entire axios response object, request and agent internals included; the tail shows the actual payload]
...["api.coingecko.com:443::::::::::::::::::"]},"keepAliveMsecs":1000,"maxFreeSockets":256,"scheduling":"fifo","keepAlive":false,"maxSockets":null},"_removedConnection":false,"writable":true},"status":200,"data":{"prices":[[1615345414698,37.27069164629981],[1615349310788,36.95627388647297],[1615352802175,37.48630338203377],...]}}

When this code is invoked it logs the data correctly, but I cannot return it at the end. Can anyone help?

ANSWER

Answered 2022-Apr-04 at 23:14

The problem appears to be that you're trying to return the entire Axios response. This cannot be serialised as JSON due to circular references.

Simply return the response data instead. You can also make your URL construction simpler (and safer) using the params option:

1const functions = require("firebase-functions");
2const axios = require("axios");
3exports.getDatas = functions.https.onCall(async (d)=>{
4  functions.logger.log(d["name"]);
5  cname = d["name"];
6  ts1=d["ts1"];
7  ts2=d["ts2"];
8  const data =  await axios.get(
9    "https://api.coingecko.com/api/v3/coins/" +
10    cname +
11      "/market_chart/range?vs_currency=usd&from=" +
12      ts1 +
13      "&to=" +
14      ts2,
15  );
16  functions.logger.log(data);
17  return {data: data};
18});
19Unhandled error Error: Data cannot be encoded in JSON: function httpAdapter(config) {
20  return new Promise(function dispatchHttpRequest(resolvePromise, rejectPromise) {
21    var onCanceled;
22    function done() {
23      if (config.cancelToken) {
24        config.cancelToken.unsubscribe(onCanceled);
25      }
26
27      if (config.signal) {
28        config.signal.removeEventListener('abort', onCanceled);
29      }
30    }
31    var resolve = function resolve(value) {
32      done();
33      resolvePromise(value);
34    };
35    var rejected = false;
36    var reject = function reject(value) {
37      done();
38      rejected = true;
39      rejectPromise(value);
40    };
41    var data = config.data;
42    var headers = config.headers;
43    var headerNames = {};
44
45    Object.keys(headers).forEach(function storeLowerName(name) {
46      headerNames[name.toLowerCase()] = name;
47    });
48
49    // Set User-Agent (required by some servers)
50    // See https://github.com/axios/axios/issues/69
51    if ('user-agent' in headerNames) {
52      // User-Agent is specified; handle case where no UA header is desired
53      if (!headers[headerNames['user-agent']]) {
54        delete headers[headerNames['user-agent']];
55      }
56      // Otherwise, use specified value
57    } else {
58      // Only set header if it hasn't been set in config
59      headers['User-Agent'] = 'axios/' + VERSION;
60    }
61
62    if (data && !utils.isStream(data)) {
63      if (Buffer.isBuffer(data)) {
64        // Nothing to do...
65      } else if (utils.isArrayBuffer(data)) {
66        data = Buffer.from(new Uint8Array(data));
67      } else if (utils.isString(data)) {
68        data = Buffer.from(data, 'utf-8');
69      } else {
70        return reject(createError(
71          'Data after transformation must be a string, an ArrayBuffer, a Buffer, or a Stream',
72          config
73        ));
74      }
75
76      if (config.maxBodyLength > -1 && data.length > config.maxBodyLength) {
77        return reject(createError('Request body larger than maxBodyLength limit', config));
78      }
79
80      // Add Content-Length header if data exists
81      if (!headerNames['content-length']) {
82        headers['Content-Length'] = data.length;
83      }
84    }
85
86    // HTTP basic authentication
87    var auth = undefined;
88    if (config.auth) {
89      var username = config.auth.username || '';
90      var password = config.auth.password || '';
91      auth = username + ':' + password;
92    }
93
94    // Parse url
95    var fullPath = buildFullPath(config.baseURL, config.url);
96    var parsed = url.parse(fullPath);
97    var protocol = parsed.protocol || 'http:';
98
99    if (!auth && parsed.auth) {
100      var urlAuth = parsed.auth.split(':');
101      var urlUsername = urlAuth[0] || '';
102      var urlPassword = urlAuth[1] || '';
103      auth = urlUsername + ':' + urlPassword;
104    }
105
106    if (auth && headerNames.authorization) {
107      delete headers[headerNames.authorization];
108    }
109
110    var isHttpsRequest = isHttps.test(protocol);
111    var agent = isHttpsRequest ? config.httpsAgent : config.httpAgent;
112
113    var options = {
114      path: buildURL(parsed.path, config.params, config.paramsSerializer).replace(/^\?/, ''),
115      method: config.method.toUpperCase(),
116      headers: headers,
117      agent: agent,
118      agents: { http: config.httpAgent, https: config.httpsAgent },
119      auth: auth
120    };
121
122    if (config.socketPath) {
123      options.socketPath = config.socketPath;
124    } else {
125      options.hostname = parsed.hostname;
126      options.port = parsed.port;
127    }
128
129    var proxy = config.proxy;
130    if (!proxy && proxy !== false) {
131      var proxyEnv = protocol.slice(0, -1) + '_proxy';
132      var proxyUrl = process.env[proxyEnv] || process.env[proxyEnv.toUpperCase()];
133      if (proxyUrl) {
134        var parsedProxyUrl = url.parse(proxyUrl);
135        var noProxyEnv = process.env.no_proxy || process.env.NO_PROXY;
136        var shouldProxy = true;
137
138        if (noProxyEnv) {
139          var noProxy = noProxyEnv.split(',').map(function trim(s) {
140            return s.trim();
141          });
142
143          shouldProxy = !noProxy.some(function proxyMatch(proxyElement) {
144            if (!proxyElement) {
145              return false;
146            }
147            if (proxyElement === '*') {
148              return true;
149            }
150            if (proxyElement[0] === '.' &&
151                parsed.hostname.substr(parsed.hostname.length - proxyElement.length) === proxyElement) {
152              return true;
153            }
154
155            return parsed.hostname === proxyElement;
156          });
157        }
158
159        if (shouldProxy) {
160          proxy = {
161            host: parsedProxyUrl.hostname,
162            port: parsedProxyUrl.port,
163            protocol: parsedProxyUrl.protocol
164          };
165
166          if (parsedProxyUrl.auth) {
167            var proxyUrlAuth = parsedProxyUrl.auth.split(':');
168            proxy.auth = {
169              username: proxyUrlAuth[0],
170              password: proxyUrlAuth[1]
171            };
172          }
173        }
174      }
175    }
176
177    if (proxy) {
178      options.headers.host = parsed.hostname + (parsed.port ? ':' + parsed.port : '');
179      setProxy(options, proxy, protocol + '//' + parsed.hostname + (parsed.port ? ':' + parsed.port : '') + options.path);
180    }
181
182    var transport;
183    var isHttpsProxy = isHttpsRequest && (proxy ? isHttps.test(proxy.protocol) : true);
184    if (config.transport) {
185      transport = config.transport;
186    } else if (config.maxRedirects === 0) {
187      transport = isHttpsProxy ? https : http;
188    } else {
189      if (config.maxRedirects) {
190        options.maxRedirects = config.maxRedirects;
191      }
192      transport = isHttpsProxy ? httpsFollow : httpFollow;
193    }
194
195    if (config.maxBodyLength > -1) {
196      options.maxBodyLength = config.maxBodyLength;
197    }
198
199    if (config.insecureHTTPParser) {
200      options.insecureHTTPParser = config.insecureHTTPParser;
201    }
202
203    // Create the request
204    var req = transport.request(options, function handleResponse(res) {
205      if (req.aborted) return;
206
207      // uncompress the response body transparently if required
208      var stream = res;
209
210      // return the last request in case of redirects
211      var lastRequest = res.req || req;
212
213
214      // if no content, is HEAD request or decompress disabled we should not decompress
215      if (res.statusCode !== 204 && lastRequest.method !== 'HEAD' && config.decompress !== false) {
216        switch (res.headers['content-encoding']) {
217        /*eslint default-case:0*/
218        case 'gzip':
219        case 'compress':
220        case 'deflate':
221        // add the unzipper to the body stream processing pipeline
222          stream = stream.pipe(zlib.createUnzip());
223
224          // remove the content-encoding in order to not confuse downstream operations
225          delete res.headers['content-encoding'];
226          break;
227        }
228      }
229
230      var response = {
231        status: res.statusCode,
232        statusText: res.statusMessage,
233        headers: res.headers,
234        config: config,
235        request: lastRequest
236      };
237
238      if (config.responseType === 'stream') {
239        response.data = stream;
240        settle(resolve, reject, response);
241      } else {
242        var responseBuffer = [];
243        var totalResponseBytes = 0;
244        stream.on('data', function handleStreamData(chunk) {
245          responseBuffer.push(chunk);
246          totalResponseBytes += chunk.length;
247
248          // make sure the content length is not over the maxContentLength if specified
249          if (config.maxContentLength > -1 && totalResponseBytes > config.maxContentLength) {
250            // stream.destoy() emit aborted event before calling reject() on Node.js v16
251            rejected = true;
252            stream.destroy();
253            reject(createError('maxContentLength size of ' + config.maxContentLength + ' exceeded',
254              config, null, lastRequest));
255          }
256        });
257
258        stream.on('aborted', function handlerStreamAborted() {
259          if (rejected) {
260            return;
261          }
262          stream.destroy();
263          reject(createError('error request aborted', config, 'ERR_REQUEST_ABORTED', lastRequest));
264        });
265
266        stream.on('error', function handleStreamError(err) {
267          if (req.aborted) return;
268          reject(enhanceError(err, config, null, lastRequest));
269        });
270
271        stream.on('end', function handleStreamEnd() {
272          try {
273            var responseData = responseBuffer.length === 1 ? responseBuffer[0] : Buffer.concat(responseBuffer);
274            if (config.responseType !== 'arraybuffer') {
275              responseData = responseData.toString(config.responseEncoding);
276              if (!config.responseEncoding || config.responseEncoding === 'utf8') {
277                responseData = utils.stripBOM(responseData);
278              }
279            }
280            response.data = responseData;
281          } catch (err) {
282            reject(enhanceError(err, config, err.code, response.request, response));
283          }
284          settle(resolve, reject, response);
285        });
286      }
287    });
288
289    // Handle errors
290    req.on('error', function handleRequestError(err) {
291      if (req.aborted && err.code !== 'ERR_FR_TOO_MANY_REDIRECTS') return;
292      reject(enhanceError(err, config, null, req));
293    });
294
295    // set TCP keep-alive to prevent the peer from dropping the connection
296    req.on('socket', function handleRequestSocket(socket) {
297      // the default interval for sending keep-alive probes is 1 minute
298      socket.setKeepAlive(true, 1000 * 60);
299    });
300
301    // Handle request timeout
302    if (config.timeout) {
303      // This forces an int timeout to avoid problems if the `req` interface doesn't handle other types.
304      var timeout = parseInt(config.timeout, 10);
305
306      if (isNaN(timeout)) {
307        reject(createError(
308          'error trying to parse `config.timeout` to int',
309          config,
310          'ERR_PARSE_TIMEOUT',
311          req
312        ));
313
314        return;
315      }
316
317      // Sometimes the response is very slow and never arrives, and the connect event can be blocked by the event loop.
318      // The timer callback then fires and abort() is invoked before the connection completes, producing "socket hang up" with code ECONNRESET.
319      // With a large number of requests, Node.js leaves some sockets hanging in the background, and their number keeps growing,
320      // so those hung sockets slowly devour CPU.
321      // ClientRequest.setTimeout fires after the specified milliseconds and guarantees that abort() is called after the connection is made.
322      req.setTimeout(timeout, function handleRequestTimeout() {
323        req.abort();
324        var transitional = config.transitional || defaults.transitional;
325        reject(createError(
326          'timeout of ' + timeout + 'ms exceeded',
327          config,
328          transitional.clarifyTimeoutError ? 'ETIMEDOUT' : 'ECONNABORTED',
329          req
330        ));
331      });
332    }
333
334    if (config.cancelToken || config.signal) {
335      // Handle cancellation
336      // eslint-disable-next-line func-names
337      onCanceled = function(cancel) {
338        if (req.aborted) return;
339
340        req.abort();
341        reject(!cancel || (cancel && cancel.type) ? new Cancel('canceled') : cancel);
342      };
343
344      config.cancelToken && config.cancelToken.subscribe(onCanceled);
345      if (config.signal) {
346        config.signal.aborted ? onCanceled() : config.signal.addEventListener('abort', onCanceled);
347      }
348    }
349
350
351    // Send the request
352    if (utils.isStream(data)) {
353      data.on('error', function handleStreamError(err) {
354        reject(enhanceError(err, config, null, req));
355      }).pipe(req);
356    } else {
357      req.end(data);
358    }
359  });
360}
361    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:162:11)
362    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
363    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
364    at encode (/workspace/node_modules/firebase-functions/lib/common/providers/https.js:156:22)
365    at /workspace/node_modules/firebase-functions/lib/common/providers/https.js:334:22
366    at processTicksAndRejections (internal/process/task_queues.js:97:5) 
367...["api.coingecko.com:443::::::::::::::::::"]},"keepAliveMsecs":1000,"maxFreeSockets":256,"scheduling":"fifo","keepAlive":false,"maxSockets":null},"_removedConnection":false,"writable":true},"status":200,"data":{"prices":[[1615345414698,37.27069164629981],[1615349310788,36.95627388647297],[1615352802175,37.48630338203377],[1615356202751,37.46442850999597],[1615360079361,37.642735963063906],[1615363905145,38.29435586902702],[1615367492353,38.313292928237594],[1615370461299,38.75503558097479],[1615374138056,38.24406575020552],[1615377815960,38.237026584388175],[1615381321332,38.93964664468625],[1615384813000,39.262646397955635],[1615388739874,39.15882057568881],[1615392094129,38.94488140309047],[1615395966875,38.79820936257378],[1615399312625,38.51637055616189],[1615403055037,38.59237008394828],[1615406529740,38.44087305010874],[1615410281814,37.71855645797291],[1615414278815,38.374824600586976],[1615417716420,38.4538669693684],[1615421045728,37.62772478442999],[1615425672990,36.8826465121472],[1615429587089,37.41958697414903],[1615432278494,37.34865694722488],[1615435254265,37.16289143388951],[1615439122292,37.14731463575248],[1615442523394,36.801517989796814],[1615446290102,37.02248224990424],[1615450361470,36.164787531097126],[1615453299572,36.46191265162147],[1615457172317,36.174755169666334],[1615460886498,37.05778010952229],[1615464298322,37.336909500902365],[1615469586325,37.56497212211488],[1615472126260,37.83046394206218],[1615474882979,37.252561357731096],[1615478498201,36.56190097084664],[1615482336185,36.83824760787625],[1615485957910,36.89351702770813],[1615489642151,37.589229946501746],[1615493390438,37.33184737771527],[1615496666244,37.29234576242379],[1615500577712,37.284260441548426],[1616866645601,1137195941.0307472],[1616870299925,1089416195.9864128],[1616873841648,1074341877.495249],[1616877368137,1061555457.3375872],[1616880970910,1077775411.1216433],[1616884693948,1064594490.6022671],[1616887998472,1087481667.611567],[1616891397951,1068140794.5197278],[1616894759953,1078753362.1719048],[1616898371565,1053546315.1245787],[1616902002474,1052498816.7223371],[1616905584364,1026915395.5541993],[1616909101481,1022271206.3215427],[1616912730390,997185793.1210617],[1616916434482,972130048.6316774],[1616919928611,988711196.2721183],[1616923534317,987299160.6191593],[1616926264719,975360472.6011684],[1616930074136,958327264.7346151],[1616933292776,935085970.8922312],[1616936940791,896217168.3654604],[1616940936847,878876312.6707534],[1616944090304,890504985.5476977],[1616948321869,896715385.5657766],[1616952007508,870767231.0865391],[1616955544207,880601758.4610194],[1616958381375,896794852.1077055],[1616962022167,929362788.5783823],[1616966479654,927502494.4691795],[1616969648773,880385481.5284289],[1616973545649,862329007.9935848],[1616977463095,840138544.6360805],[1616980359587,849727926.595521],[1616984356096,820616225.3306137],[1616987602367,898085663.0760688],[1616990444958,890215727.4112909],[1616995470635,914823340.6343507],[1616999032159,890922230.685704],[1617002651977,937214914.0703756],[1617005329558,976030203.3879734],[1617009370471,1061898884.4388478],[1617012348377,1111994349.2592206],[1617015705482,1175310227.1595278],[1617019895549,1217044915.3900926],[1617022941451,1204239369.9336267],[1617027118715,1225123359.1178432],[1617031210170,1191418570.9198012],[1617033728601,1257085051.9742537],[1617037882992,1261291734.3667347],[1617041858553,1265805909.4506621],[1617044547418,1261869965.5784621],[1617049418534,1225924891.220988],[1617052450394,1200646247.466799],[
1617055896172,1209247034.0807025],[1617059684123,1249662106.3996315],[1617062561979,837849935.5380555],[1617066155823,1261094295.2039979],[1617070572708,1244044711.3556864],[1617074210159,1178503497.252399],[1617077106612,1184744920.254339],[1617080571662,1219164970.9205332],[1617084836477,1174744890.1399443],[1617087739776,1236332180.5454476],[1617092763739,1121685108.4046226],[1617096303391,1074005978.1362224],[1617100013739,1075898891.906641],[1617102136947,1041120230.0169744],[1617106411165,1021062028.7444541],[1617110588848,1004207600.6385714],[1617114148509,983098685.435342],[1617117449987,983878432.6976557],[1617120868725,943893192.0239582],[1617123806180,948379973.8680001],[1617128347360,948328240.0510467],[1617131244094,923477307.6495335],[1617134866719,918321070.6284192],[1617138697011,960178009.2986945],[1617142067857,974105207.7725881],[1617146083923,973959760.0729104],[1617149999086,959500047.5209063],[1617153094367,1007753562.6156206],[1617156698445,1021534121.1115336],[1617160175611,1028067427.0339341],[1617163928330,1007755251.8882328],[1617166924538,1023240773.0466446],[1617171886674,1037535813.1806505],[1617175133694,1101375379.7094195],[1617178435173,1136688478.90344],[1617182857658,1208366620.2561867],[1617185353773,1208823054.3509212],[1617188828477,1234197192.568771],[1617193393471,1707076315.380663],[1617196301983,1845668637.7358408],[1617199516026,1901877634.1385415],[1617203681947,2015292037.1305778],[1617207515426,2141098631.115179],[1617210224998,2343473154.2871637],[1617214323265,2329074198.4966955],[1617217968405,2461828129.1798186],[1617221653017,2493042958.539376],[1617224582971,2532015555.7692595],[1617228589364,2508661361.110037],[1617232204720,2590057969.924583],[1617235260464,2749780924.550207],[1617239367664,2791689438.967896],[1617243152558,2778422749.5901804],[1617246573894,2802892972.2612605],[1617250114952,2795446026.902383],[1617253276300,2837092221.188881],[1617257741390,2957061611.281718],[1617261111556,3025594776.954216],[1617264301698,3140730366.12618],[1617267704421,3230797741.627739],[1617272276500,3247001347.7404704],[1617275862720,3182990384.8873067],[1617279129292,2889317168.9977646],[1617283053665,2753527702.506779],[1617287046529,2700392654.8781624],[1617290204012,2616296684.424929],[1617293298853,2494255828.9768047],[1617296557242,2383424694.8900166],[1617301325511,2288268623.177356],[1617303766777,2297155897.636895],[1617307669347,2314935325.319679],[1617311721980,2259716784.056617],[1617314946823,2267889595.9127536],[1617319572007,2174169254.528509],[1617323182318,2097690604.8152165],[1617326033792,2110975746.1916978],[1617329489226,2126100629.800452],[1617332409284,2193182655.044224],[1617337211709,2199847063.5248647],[1617340611316,2167549077.601362],[1617344146863,2110348803.8388174],[1617347361962,2023115590.5637138],[1617351380142,1864316761.5098753],[1617354151186,1788973202.0040677],[1617359277447,1731207666.0376515],[1617361312976,1418566500.3106787],[1617366169158,1693859181.5518322],[1617369860769,1656689094.290342],[1617372306072,1660176536.7450612],[1617376754676,1722154482.4234965],[1617379285817,1915067128.493045],[1617383311995,1982773491.2907202],[1617387963188,1985155493.939231],[1617391564495,1827213471.6221747],[1617395202777,1932891922.7380657],[1617398214973,1937931474.560893],[1617401809690,1961473630.4188676],[1617405699909,1952347409.661483],[1617409553080,2172811188.054834],[1617412963837,2431917537.219363],[1617416445822,2666886575.1140027],[1617420431122,2769520722.4907126],[1617422613890,2797409323.779513],[1
617427393260,2895546310.6951184],[1617431058021,2894169435.883223],[1617433696700,2651591430.614699],[1617437513773,3448548871.8910036],[1617441138039,3537764498.5278754],[1617444820385,3662623380.0181885],[1617448128419,3729999481.3895626],[1617452094944,3741683833.307362],[1617457034540,3761774670.321721],[1617460631688,3809173022.555833],[1617464335978,3711591162.8519845],[1617467879738,3759143118.4621553],[1617471447610,3693936894.7524076],[1617474960418,3833857114.2069917],[1617478639837,3888109113.59996],[1617482233320,3857034438.9984646],[1617485821346,3898924734.2645984],[1617489477282,3952661186.2182713],[1617493109729,4002501827.9437523],[1617495709286,3872814933.0218143],[1617499443431,3939579930.8108554],[1617503699037,3663106636.5813146],[1617507443725,3808705623.491391],[1617510706891,3786240536.055139],[1617512446242,3717882675.3539762],[1617516040645,3722966733.2957063],[1617519813304,3482249884.952562],[1617523351916,3345586253.508183],[1617526909722,3327000473.8244348],[1617530664916,3181835266.2617188],[1617534176048,3094776290.1306324],[1617537924632,3064167829.684326],[1617541493704,3112790145.252149],[1617545018360,2989449570.670528],[1617548594506,3016965749.017692],[1617552471191,2973530338.557288],[1617555933696,2759208177.1915674],[1617559387440,2662906186.1813793],[1617563034515,2521716547.9565806],[1617566483711,2454800946.788864],[1617570325792,2412175803.4922743],[1617573668989,2381142461.766321],[1617577282876,2228904400.2017546],[1617580896737,2203439508.717633],[1617584514686,2083961834.3200803],[1617588367701,1922511436.832222],[1617591869391,1816453643.1859522],[1617595346098,1783362433.1356776],[1617599069131,1767878927.408502],[1617602711113,1782121869.0062866],[1617606278078,1784322317.8294444],[1617609891135,1785304724.1970084],[1617613319383,1792007217.4012969],[1617617302304,1808002080.6732872],[1617620901014,1821923720.87615],[1617624265084,1769426364.6123836],[1617629555312,1731155926.337212],[1617631504259,1735378701.9021676],[1617635133537,1942437073.2385755],[1617638780500,1938122743.6976163],[1617642119732,1932182393.8447528],[1617645707597,1918416705.3436842],[1617649325384,1925855235.7182896],[1617653252063,1944708214.0244768],[1617656889033,1932665022.73478],[1617660329160,1943687775.1192245],[1617663683699,1971924479.2343264],[1617667435208,2101421530.2666874],[1617672769205,2175322213.812557],[1617674524812,2168578229.7784457],[1617678186353,2149217571.1759067],[1617681915267,2132725563.885806],[1617685469475,1907950838.2268875],[1617689189705,2026223167.4473426],[1617692670953,1991840998.8517568],[1617696101989,1958389716.0448081],[1617699877898,2027665770.2623076],[1617703590445,2045913908.1590445],[1617707076556,2057724347.183567],[1617710622851,1722203248.9530182],[1617714225215,2160140597.446546],[1617717905528,2192080372.5552874],[1617721488585,2199844279.449877],[1617724918808,2244159138.5689125],[1617728548093,2263548854.897557],[1617732187891,2106855536.9938018],[1617735969816,2268365061.664965],[1617739538518,1863113060.588111],[1617742875565,2296819840.9881096],[1617746516853,2308037223.56185],[1617750327052,2297405821.9954567],[1617754017835,2215648462.217197],[1617758617023,2112353884.9607923],[1617761085616,2094123582.0260437],[1617764518134,2101292245.7045105],[1617768287923,2104106865.0792534],[1617771810289,2127056476.4717],[1617775566730,2152196953.3590703],[1617778865860,2160666464.579131],[1617782881414,2201171213.1865735],[1617786249160,2203934869.139618],[1617789807394,2329117281.806726],[1617793383957,2333039138.8999
13],[1617796986959,2491205752.3653517],[1617800521125,2652604590.3673797],[1617804331429,2692817000.168284],[1617807822435,2121796914.212729],[1617811418506,2538097921.330415],[1617815037057,2572049083.87979],[1617818698211,2550478468.4248347],[1617822347031,2541491737.3311806],[1617825969097,2609118564.630648],[1617829326876,2651351577.1099257],[1617833305171,2429954572.560337],[1617837011298,2435043578.3313527],[1617840572965,2394428204.082167],[1617843841041,2446826032.07983],[1617848315742,2395082349.188743],[1617850339793,2376349751.741466],[1617852591890,2385498650.2366877],[1617855126472,2380054416.699361],[1617858732962,2424505564.216302],[1617862619633,2434391633.272485],[1617865876330,2410962812.9744062],[1617869233838,2516114320.406959],[1617872539799,2437748581.3302546],[1617876293610,2247205079.171164],[1617880005259,2149347865.150653],[1617883394235,1893777066.5168178],[1617886836203,1757412804.559377],[1617892197847,1668727963.8671286],[1617894162445,1631584545.4824028],[1617897737215,1596293896.725426],[1617901282046,1525523967.3370435],[1617905003853,1370316987.26801],[1617908631874,1358993841.079183],[1617912335250,1404691449.9164236],[1617915995319,1379405950.1047523],[1617919567600,1366246502.7408085],[1617923270275,1289254721.1461022],[1617926622919,1386402238.6371279],[1617930228859,1384851642.1789908],[1617933705891,1365548610.2907162],[1617937372258,1357266138.9978309],[1617941122560,1335764096.6047564],[1617944870896,1322495289.1105938],[1617948462328,1283751933.8339043],[1617951863802,1272489837.990008],[1617955666499,1259096045.8789752],[1617958890026,1247182948.0102005],[1617962609987,1220448763.9536679],[1617966256703,1222538618.147044],[1617969964555,1148194206.4734476],[1617973333279,1199996169.7479842],[1617977646106,1154935691.529977],[1617980504476,1144564005.003322],[1617984273306,1132822242.6037295],[1617987925282,1136733019.0246003],[1617991396077,1139090847.1565342],[1617994822351,1133169530.4839995],[1617998615234,1113274570.5832539],[1618002141094,1094805189.6349592],[1618005876460,1035579604.067034],[1618009282025,1090335224.3969038],[1618013035782,1063984405.5106469],[1618016519119,1058097513.8615906],[1618020114108,1065381128.0365001]]}} 
368exports.getDatas = functions.https.onCall(async ({ name, ts1, ts2 }) => {
369  functions.logger.log(name);
370  //    👇 note the destructure here
371  const { data } = await axios.get(
372    `https://api.coingecko.com/api/v3/coins/${encodeURIComponent(name)}/market_chart/range`,
373    {
374      params: {
375        vs_currency: "usd",
376        from: ts1,
377        to: ts2,
378      }
379    }
380  );
381  functions.logger.log(data);
382  return { data };
383});
384
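The destructure matters because the full axios response object carries circular references (the underlying request, socket, and agent, visible in the serialized dump above), and the functions serializer recurses into them, which is consistent with the repeated encode frames in the stack trace. Returning only data keeps the payload plain JSON.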

Source https://stackoverflow.com/questions/71743927

QUESTION

Flink missing windows generated on some partitions

Asked 2022-Feb-14 at 20:51

I am trying to write a small Flink dataflow to better understand how it works, and I am facing a strange situation: each time I run it I get inconsistent output, and sometimes records I am expecting are missing. Keep in mind this is just a toy example I am building to learn the concepts of the DataStream API.

I have a dataset of around 7600 rows in CSV format that looks like this:

Date,Country,City,Specie,count,min,max,median,variance
28/06/2021,GR,Athens,no2,116,0.5,58.9,5.5,2824.39
28/06/2021,GR,Athens,wind-speed,133,0.1,11.2,3,96.69
28/06/2021,GR,Athens,dew,24,14,20,18,35.92
28/06/2021,GR,Athens,temperature,141,24.4,38.4,30.5,123.18
28/06/2021,GR,Athens,pm25,116,34,85,68,702.29

Full dataset here: https://pastebin.com/rknnRnPc

There are no special characters or quotes, so a simple String split will work fine.

The date range for each city spans from 28/06/2021 to 03/10/2021.

I am reading it using the DataStream API:

final DataStream<String> source = env.readTextFile("data.csv");

Each row is mapped to a simple POJO as follows:

public class CityMetric {

    private static final DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    private final LocalDate localDate;
    private final String country;
    private final String city;
    private final String reading;
    private final int count;
    private final double min;
    private final double max;
    private final double median;
    private final double variance;

    private CityMetric(LocalDate localDate, String country, String city, String reading, int count, double min, double max, double median, double variance) {
        this.localDate = localDate;
        this.country = country;
        this.city = city;
        this.reading = reading;
        this.count = count;
        this.min = min;
        this.max = max;
        this.median = median;
        this.variance = variance;
    }

    public static CityMetric fromArray(String[] arr) {
        LocalDate date = LocalDate.parse(arr[0], dateFormatter);
        int count = Integer.parseInt(arr[4]);
        double min = Double.parseDouble(arr[5]);
        double max = Double.parseDouble(arr[6]);
        double median = Double.parseDouble(arr[7]);
        double variance = Double.parseDouble(arr[8]);

        return new CityMetric(date, arr[1], arr[2], arr[3], count, min, max, median, variance);
    }

    public long getTimestamp() {
        return getLocalDate()
                .atStartOfDay()
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

//getters follow
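For reference, one sample row parses like this (a hypothetical snippet for illustration; it assumes the elided getters exist):

public class ParseCheck {
  public static void main(String[] args) {
    CityMetric metric = CityMetric.fromArray(
        "28/06/2021,GR,Athens,pm25,116,34,85,68,702.29".split(","));

    // Midnight UTC of 28 June 2021:
    System.out.println(metric.getTimestamp()); // 1624838400000 (2021-06-28T00:00:00Z)
  }
}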

The records are all in order of date, so I have this to set the event time and watermark:

final WatermarkStrategy<CityMetric> cityMetricWatermarkStrategy =
        WatermarkStrategy.<CityMetric>forMonotonousTimestamps()  //we know they are sorted by time
                .withTimestampAssigner((cityMetric, l) -> cityMetric.getTimestamp());

I have a StreamingFileSink on a Tuple4 to output the date range, city and average:

final StreamingFileSink<Tuple4<LocalDate, LocalDate, String, Double>> fileSink =
      StreamingFileSink.forRowFormat(
              new Path("airquality"),
              new SimpleStringEncoder<Tuple4<LocalDate, LocalDate, String, Double>>("UTF-8"))
          .build();

And finally I have the dataflow as follows:

source
        .map(s -> s.split(",")) //split the CSV row into its fields
        .filter(arr -> !arr[0].startsWith("Date")) // if it starts with Date it means it is the top header
        .map(CityMetric::fromArray)  //create the object from the fields
        .assignTimestampsAndWatermarks(cityMetricWatermarkStrategy) // we use the date as the event time
        .filter(cm -> cm.getReading().equals("pm25")) // we want air quality of fine particulate matter pm2.5
        .keyBy(CityMetric::getCity) // partition by city name
        .window(TumblingEventTimeWindows.of(Time.days(7))) //windows of 7 days
        .aggregate(new CityAverageAggregate()) // average the values
        .name("cityair")
        .addSink(fileSink); //output each partition to a file
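One detail that matters when reading the output later: TumblingEventTimeWindows.of(Time.days(7)) aligns windows to the epoch (1970-01-01, a Thursday), not to the first event. A minimal sketch of the alignment arithmetic, mirroring Flink's TimeWindow.getWindowStartWithOffset with a zero offset:

import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

public class WindowAlignment {
  static final long WINDOW_SIZE_MS = 7L * 24 * 60 * 60 * 1000; // Time.days(7)

  // Window start for a given event timestamp (offset = 0, as in the job above).
  static long windowStart(long timestamp) {
    return timestamp - (timestamp % WINDOW_SIZE_MS);
  }

  public static void main(String[] args) {
    long ts = LocalDate.of(2021, 6, 28).atStartOfDay().toInstant(ZoneOffset.UTC).toEpochMilli();
    System.out.println(Instant.ofEpochMilli(windowStart(ts)));                  // 2021-06-24T00:00:00Z, a Thursday
    System.out.println(Instant.ofEpochMilli(windowStart(ts) + WINDOW_SIZE_MS)); // 2021-07-01T00:00:00Z
  }
}

This is why the first range in the complete output further down runs only from 2021-06-28 to 2021-06-30: the dataset starts mid-window, and the window containing those rows closes at 2021-07-01.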

The CityAverageAggregate just accumulates the sum and count, and keeps track of the earliest and latest dates of the range it is covering.

public class CityAverageAggregate
    implements AggregateFunction<
        CityMetric, CityAverageAggregate.AverageAccumulator, Tuple4<LocalDate, LocalDate, String, Double>> {

  @Override
  public AverageAccumulator createAccumulator() {
    return new AverageAccumulator();
  }

  @Override
  public AverageAccumulator add(CityMetric cityMetric, AverageAccumulator averageAccumulator) {
    return averageAccumulator.add(
        cityMetric.getCity(), cityMetric.getLocalDate(), cityMetric.getMedian());
  }

  @Override
  public Tuple4<LocalDate, LocalDate, String, Double> getResult(
      AverageAccumulator averageAccumulator) {
    return Tuple4.of(
        averageAccumulator.getStart(),
        averageAccumulator.getEnd(),
        averageAccumulator.getCity(),
        averageAccumulator.average());
  }

  @Override
  public AverageAccumulator merge(AverageAccumulator acc1, AverageAccumulator acc2) {
    return acc1.merge(acc2);
  }

  public static class AverageAccumulator {
    private final String city;
    private final LocalDate start;
    private final LocalDate end;
    private final long count;
    private final double sum;

    public AverageAccumulator() {
      city = "";
      count = 0;
      sum = 0;
      start = null;
      end = null;
    }

    AverageAccumulator(String city, LocalDate start, LocalDate end, long count, double sum) {
      this.city = city;
      this.count = count;
      this.sum = sum;
      this.start = start;
      this.end = end;
    }

    public AverageAccumulator add(String city, LocalDate eventDate, double value) {
      //make sure our dataflow is correct and we are summing data from the same city
      if (!this.city.equals("") && !this.city.equals(city)) {
        throw new IllegalArgumentException(city + " does not match " + this.city);
      }

      return new AverageAccumulator(
          city,
          earliest(this.start, eventDate),
          latest(this.end, eventDate),
          this.count + 1,
          this.sum + value);
    }

    public AverageAccumulator merge(AverageAccumulator that) {
      LocalDate mergedStart = earliest(this.start, that.start);
      LocalDate mergedEnd = latest(this.end, that.end);
      return new AverageAccumulator(
          this.city, mergedStart, mergedEnd, this.count + that.count, this.sum + that.sum);
    }

    private LocalDate earliest(LocalDate d1, LocalDate d2) {
      if (d1 == null) {
        return d2;
      } else if (d2 == null) {
        return d1;
      } else {
        return d1.isBefore(d2) ? d1 : d2;
      }
    }

    private LocalDate latest(LocalDate d1, LocalDate d2) {
      if (d1 == null) {
        return d2;
      } else if (d2 == null) {
        return d1;
      } else {
        return d1.isAfter(d2) ? d1 : d2;
      }
    }

    public double average() {
      return sum / count;
    }

    public String getCity() {
      return city;
    }

    public LocalDate getStart() {
      return start;
    }

    public LocalDate getEnd() {
      return end;
    }
  }
}
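As a quick sanity check, the accumulator can be exercised outside Flink the same way the window operator would drive it (a hypothetical snippet, assuming the classes above and java.time.LocalDate are on the classpath):

public class AccumulatorCheck {
  public static void main(String[] args) {
    CityAverageAggregate.AverageAccumulator acc =
        new CityAverageAggregate.AverageAccumulator()
            .add("Belgrade", java.time.LocalDate.of(2021, 7, 1), 40.0)
            .add("Belgrade", java.time.LocalDate.of(2021, 7, 3), 44.0);

    System.out.println(acc.average());                         // 42.0
    System.out.println(acc.getStart() + " / " + acc.getEnd()); // 2021-07-01 / 2021-07-03
  }
}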

Problem:

The problem I am facing is that sometimes I do not get all the windows I am expecting. This does not always happen; sometimes consecutive runs produce different output, so I suspect there is a race condition somewhere.

For example, in one of the partition file outputs I sometimes get:

(2021-07-12,2021-07-14,Belgrade,56.666666666666664)
(2021-07-15,2021-07-21,Belgrade,56.0)
(2021-07-22,2021-07-28,Belgrade,57.285714285714285)
(2021-07-29,2021-08-04,Belgrade,43.57142857142857)
(2021-08-05,2021-08-11,Belgrade,35.42857142857143)
(2021-08-12,2021-08-18,Belgrade,43.42857142857143)
(2021-08-19,2021-08-25,Belgrade,36.857142857142854)
(2021-08-26,2021-09-01,Belgrade,50.285714285714285)
(2021-09-02,2021-09-08,Belgrade,46.285714285714285)
(2021-09-09,2021-09-15,Belgrade,54.857142857142854)
(2021-09-16,2021-09-22,Belgrade,56.714285714285715)
(2021-09-23,2021-09-29,Belgrade,59.285714285714285)
(2021-09-30,2021-10-03,Belgrade,61.5)

While sometimes I get the full set:

(2021-06-28,2021-06-30,Belgrade,48.666666666666664)
(2021-07-01,2021-07-07,Belgrade,41.142857142857146)
(2021-07-08,2021-07-14,Belgrade,52.857142857142854)
(2021-07-15,2021-07-21,Belgrade,56.0)
(2021-07-22,2021-07-28,Belgrade,57.285714285714285)
(2021-07-29,2021-08-04,Belgrade,43.57142857142857)
(2021-08-05,2021-08-11,Belgrade,35.42857142857143)
(2021-08-12,2021-08-18,Belgrade,43.42857142857143)
(2021-08-19,2021-08-25,Belgrade,36.857142857142854)
(2021-08-26,2021-09-01,Belgrade,50.285714285714285)
(2021-09-02,2021-09-08,Belgrade,46.285714285714285)
(2021-09-09,2021-09-15,Belgrade,54.857142857142854)
(2021-09-16,2021-09-22,Belgrade,56.714285714285715)
(2021-09-23,2021-09-29,Belgrade,59.285714285714285)
(2021-09-30,2021-10-03,Belgrade,61.5)

Is there anything evidently wrong in my dataflow pipeline? I can't figure out why this happens, and it doesn't always happen on the same city either.

What could be happening?
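One property of the pipeline worth checking, offered as a hypothesis rather than a diagnosis: readTextFile distributes file splits across parallel subtasks, and the timestamp assigner runs after that redistribution, so a subtask can see rows out of timestamp order even though the file itself is sorted. forMonotonousTimestamps() then emits watermarks that jump ahead, and rows arriving behind the watermark are silently dropped from their windows, which would show up as missing windows that vary from run to run. A sketch of a more tolerant strategy (the two-day bound and the java.time.Duration import are illustrative assumptions, not the author's code):

// Hypothetical alternative: tolerate out-of-order rows instead of
// assuming per-subtask monotonicity.
final WatermarkStrategy<CityMetric> boundedStrategy =
    WatermarkStrategy.<CityMetric>forBoundedOutOfOrderness(Duration.ofDays(2))
        .withTimestampAssigner((cityMetric, l) -> cityMetric.getTimestamp());

Running with env.setParallelism(1) would also be an easy way to test this hypothesis, since it restores the global ordering the monotonic strategy expects.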

UPDATE

So it seems that when I disabled watermarks, the problem no longer happened. I changed the WatermarkStrategy to the following:

1Date,Country,City,Specie,count,min,max,median,variance
228/06/2021,GR,Athens,no2,116,0.5,58.9,5.5,2824.39
328/06/2021,GR,Athens,wind-speed,133,0.1,11.2,3,96.69
428/06/2021,GR,Athens,dew,24,14,20,18,35.92
528/06/2021,GR,Athens,temperature,141,24.4,38.4,30.5,123.18
628/06/2021,GR,Athens,pm25,116,34,85,68,702.29
7public class CityMetric {
8
9    private static final DateTimeFormatter dateFormatter = DateTimeFormatter.ofPattern(&quot;dd/MM/yyyy&quot;);
10
11    private final LocalDate localDate;
12    private final String country;
13    private final String city;
14    private final String reading;
15    private final int count;
16    private final double min;
17    private final double max;
    private final double median;
    private final double variance;

    private CityMetric(LocalDate localDate, String country, String city, String reading, int count, double min, double max, double median, double variance) {
        this.localDate = localDate;
        this.country = country;
        this.city = city;
        this.reading = reading;
        this.count = count;
        this.min = min;
        this.max = max;
        this.median = median;
        this.variance = variance;
    }

    public static CityMetric fromArray(String[] arr) {
        LocalDate date = LocalDate.parse(arr[0], dateFormatter);
        int count = Integer.parseInt(arr[4]);
        double min = Double.parseDouble(arr[5]);
        double max = Double.parseDouble(arr[6]);
        double median = Double.parseDouble(arr[7]);
        double variance = Double.parseDouble(arr[8]);

        return new CityMetric(date, arr[1], arr[2], arr[3], count, min, max, median, variance);
    }

    public long getTimestamp() {
        return getLocalDate()
                .atStartOfDay()
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    // getters follow
}

final WatermarkStrategy<CityMetric> cityMetricWatermarkStrategy =
        WatermarkStrategy.<CityMetric>forMonotonousTimestamps()  // we know they are sorted by time
                .withTimestampAssigner((cityMetric, l) -> cityMetric.getTimestamp());

final StreamingFileSink<Tuple4<LocalDate, LocalDate, String, Double>> fileSink =
        StreamingFileSink.forRowFormat(
                new Path("airquality"),
                new SimpleStringEncoder<Tuple4<LocalDate, LocalDate, String, Double>>("UTF-8"))
            .build();

source
        .map(s -> s.split(","))  // split the CSV row into its fields
        .filter(arr -> !arr[0].startsWith("Date"))  // a row starting with "Date" is the header
        .map(CityMetric::fromArray)  // create the object from the fields
        .assignTimestampsAndWatermarks(cityMetricWatermarkStrategy)  // we use the date as the event time
        .filter(cm -> cm.getReading().equals("pm25"))  // we want air quality of fine particulate matter, pm2.5
        .keyBy(CityMetric::getCity)  // partition by city name
        .window(TumblingEventTimeWindows.of(Time.days(7)))  // windows of 7 days
        .aggregate(new CityAverageAggregate())  // average the values
        .name("cityair")
        .addSink(fileSink);  // output each partition to a file

public class CityAverageAggregate
    implements AggregateFunction<
        CityMetric, CityAverageAggregate.AverageAccumulator, Tuple4<LocalDate, LocalDate, String, Double>> {

  @Override
  public AverageAccumulator createAccumulator() {
    return new AverageAccumulator();
  }

  @Override
  public AverageAccumulator add(CityMetric cityMetric, AverageAccumulator averageAccumulator) {
    return averageAccumulator.add(
        cityMetric.getCity(), cityMetric.getLocalDate(), cityMetric.getMedian());
  }

  @Override
  public Tuple4<LocalDate, LocalDate, String, Double> getResult(
      AverageAccumulator averageAccumulator) {
    return Tuple4.of(
        averageAccumulator.getStart(),
        averageAccumulator.getEnd(),
        averageAccumulator.getCity(),
        averageAccumulator.average());
  }

  @Override
  public AverageAccumulator merge(AverageAccumulator acc1, AverageAccumulator acc2) {
    return acc1.merge(acc2);
  }

  public static class AverageAccumulator {
    private final String city;
    private final LocalDate start;
    private final LocalDate end;
    private final long count;
    private final double sum;

    public AverageAccumulator() {
      city = "";
      count = 0;
      sum = 0;
      start = null;
      end = null;
    }

    AverageAccumulator(String city, LocalDate start, LocalDate end, long count, double sum) {
      this.city = city;
      this.count = count;
      this.sum = sum;
      this.start = start;
      this.end = end;
    }

    public AverageAccumulator add(String city, LocalDate eventDate, double value) {
      // make sure our dataflow is correct and we are summing data from the same city
      if (!this.city.equals("") && !this.city.equals(city)) {
        throw new IllegalArgumentException(city + " does not match " + this.city);
      }

      return new AverageAccumulator(
          city,
          earliest(this.start, eventDate),
          latest(this.end, eventDate),
          this.count + 1,
          this.sum + value);
    }

    public AverageAccumulator merge(AverageAccumulator that) {
      LocalDate mergedStart = earliest(this.start, that.start);
      LocalDate mergedEnd = latest(this.end, that.end);
      return new AverageAccumulator(
          this.city, mergedStart, mergedEnd, this.count + that.count, this.sum + that.sum);
    }

    private LocalDate earliest(LocalDate d1, LocalDate d2) {
      if (d1 == null) {
        return d2;
      } else if (d2 == null) {
        return d1;
      } else {
        return d1.isBefore(d2) ? d1 : d2;
      }
    }

    private LocalDate latest(LocalDate d1, LocalDate d2) {
      if (d1 == null) {
        return d2;
      } else if (d2 == null) {
        return d1;
      } else {
        return d1.isAfter(d2) ? d1 : d2;
      }
    }

    public double average() {
      return sum / count;
    }

    public String getCity() {
      return city;
    }

    public LocalDate getStart() {
      return start;
    }

    public LocalDate getEnd() {
      return end;
    }
  }
}

With forMonotonousTimestamps() I was getting output like this, with the earliest windows missing:

(2021-07-12,2021-07-14,Belgrade,56.666666666666664)
(2021-07-15,2021-07-21,Belgrade,56.0)
(2021-07-22,2021-07-28,Belgrade,57.285714285714285)
(2021-07-29,2021-08-04,Belgrade,43.57142857142857)
(2021-08-05,2021-08-11,Belgrade,35.42857142857143)
(2021-08-12,2021-08-18,Belgrade,43.42857142857143)
(2021-08-19,2021-08-25,Belgrade,36.857142857142854)
(2021-08-26,2021-09-01,Belgrade,50.285714285714285)
(2021-09-02,2021-09-08,Belgrade,46.285714285714285)
(2021-09-09,2021-09-15,Belgrade,54.857142857142854)
(2021-09-16,2021-09-22,Belgrade,56.714285714285715)
(2021-09-23,2021-09-29,Belgrade,59.285714285714285)
(2021-09-30,2021-10-03,Belgrade,61.5)

whereas the output covering the input's full date range looks like this:

(2021-06-28,2021-06-30,Belgrade,48.666666666666664)
(2021-07-01,2021-07-07,Belgrade,41.142857142857146)
(2021-07-08,2021-07-14,Belgrade,52.857142857142854)
(2021-07-15,2021-07-21,Belgrade,56.0)
(2021-07-22,2021-07-28,Belgrade,57.285714285714285)
(2021-07-29,2021-08-04,Belgrade,43.57142857142857)
(2021-08-05,2021-08-11,Belgrade,35.42857142857143)
(2021-08-12,2021-08-18,Belgrade,43.42857142857143)
(2021-08-19,2021-08-25,Belgrade,36.857142857142854)
(2021-08-26,2021-09-01,Belgrade,50.285714285714285)
(2021-09-02,2021-09-08,Belgrade,46.285714285714285)
(2021-09-09,2021-09-15,Belgrade,54.857142857142854)
(2021-09-16,2021-09-22,Belgrade,56.714285714285715)
(2021-09-23,2021-09-29,Belgrade,59.285714285714285)
(2021-09-30,2021-10-03,Belgrade,61.5)

So I changed the watermark strategy to noWatermarks():

final WatermarkStrategy<CityMetric> cityMetricWatermarkStrategy =
        WatermarkStrategy.<CityMetric>noWatermarks()
                         .withTimestampAssigner((cityMetric, l) -> cityMetric.getTimestamp());

And so far I have been getting consistent results. When I checked the documentation, it says:

static WatermarkStrategy noWatermarks()

Creates a watermark strategy that generates no watermarks at all. This may be useful in scenarios that do pure processing-time based stream processing.

But I am not doing processing-time based stream processing; I am doing event-time processing.

Why would forMonotonousTimestamps() show the strange behaviour I was seeing? My timestamps are indeed monotonically increasing (the noWatermarks strategy wouldn't work if they weren't), but somehow this strategy does not work well in my scenario.

Is there anything I am missing with the way things work in Flink?

ANSWER

Answered 2022-Feb-14 at 20:51

Flink doesn't support per-key watermarking. Each parallel task generates watermarks independently, based on observing all of the events flowing through that task.

So the reason this isn't working with the forMonotonousTimestamps watermark strategy is that the input is not actually in order by timestamp. It is temporally sorted within each city, but not globally. This is then going to result in some records being late, but unpredictably so, depending on exactly when watermarks are generated. These late events are being ignored by the windows that should contain them.

You can address this in a number of ways:

(1) Use a forBoundedOutOfOrderness watermark strategy with a duration sufficient to account for the actual out-of-order-ness in the dataset. Given that the data looks something like this:

Date,Country,City,Specie,count,min,max,median,variance
28/06/2021,GR,Athens,no2,116,0.5,58.9,5.5,2824.39
28/06/2021,GR,Athens,wind-speed,133,0.1,11.2,3,96.69
28/06/2021,GR,Athens,dew,24,14,20,18,35.92
28/06/2021,GR,Athens,temperature,141,24.4,38.4,30.5,123.18
28/06/2021,GR,Athens,pm25,116,34,85,68,702.29
...
03/10/2021,GR,Athens,pressure,60,1017.9,1040.6,1020.9,542.4
28/06/2021,US,Atlanta,co,24,1.4,7.3,2.2,19.05

that will require an out-of-order-ness duration of approximately 100 days.
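
For illustration, option (1) might look like the sketch below, reusing the question's CityMetric type and timestamp assigner; the 100-day bound is only an estimate read off the sample data, not a tuned value:

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

final WatermarkStrategy<CityMetric> boundedStrategy =
        WatermarkStrategy.<CityMetric>forBoundedOutOfOrderness(Duration.ofDays(100))
                .withTimestampAssigner((cityMetric, l) -> cityMetric.getTimestamp());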

(2) Configure the windows to have sufficient allowed lateness. This will result in some of the windows being triggered multiple times -- once when the watermark indicates they can close, and again each time a late event is added to the window.
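
A minimal sketch of option (2), assuming the same windowed pipeline as in the question; Time.days(100) mirrors the out-of-order-ness estimate above and would need the same tuning:

source
        .keyBy(CityMetric::getCity)
        .window(TumblingEventTimeWindows.of(Time.days(7)))
        .allowedLateness(Time.days(100))  // keep window state around so late events can update results
        .aggregate(new CityAverageAggregate());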

(3) Use the noWatermarks strategy. This will lead to the job only producing results if and when it reaches the end of its input file(s). For a continuous streaming job this wouldn't be workable, but for finite (bounded) inputs this can work.

(4) Run the job in RuntimeExecutionMode.BATCH mode. Then the job will only produce results at the end, after having consumed all of its input. This will run the job with a more optimized runtime designed for batch workloads, but the outcome should be the same as with (3).
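
Assuming env is the StreamExecutionEnvironment used to build the job, option (4) is a one-line change:

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.BATCH);  // bounded input, results emitted at the end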

(5) Change the input so it isn't out-of-order.

Source https://stackoverflow.com/questions/71091773

QUESTION

Do I need a JAR file to run a Flink application?

Asked 2022-Jan-20 at 21:29

I am working through the book Stream Processing with Apache Flink by Fabian Hueske and Vasiliki Kalavri.

The book includes an example Flink application, and I want to figure out how to run the file. It is a Scala file located here in their GitHub repo.

Must I turn it into a JAR file before I run it? If so, how do I convert the file to JAR?

ANSWER

Answered 2022-Jan-20 at 21:29

Except for SQL queries submitted with Flink's SQL client, a user needs to package a JAR file. Usually, a Flink program is packaged using a Maven or Gradle project:

https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/project-configuration/#maven-quickstart

Source https://stackoverflow.com/questions/70793217

QUESTION

Calling Hibernate in Spring cloud Stream

Asked 2022-Jan-03 at 10:17

I'm new to Spring Cloud Stream.

Say I have a Spring Cloud Stream app that listens to some topic from Kafka using @StreamListener("input-channel").

I want to do some calculation and send the result to another topic, but in the middle of the processing I also need to call Hibernate (via Spring Data JPA) to persist some data to my MySQL database.

Is it valid to call Hibernate in the middle of stream processing? Is there another pattern for doing it?

ANSWER

Answered 2022-Jan-03 at 10:17

Yes, it's a database call, so why not. People do it all the time.

Also, @StreamListener has been deprecated for 3 years now and is already removed from newer versions, so please transition to the functional programming model.
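
A minimal sketch of that functional model, assuming Spring Cloud Stream 3.x; Order, Result, and OrderRepository are hypothetical stand-ins for the poster's own entity, output type, and Spring Data JPA repository:

import java.util.function.Function;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ProcessingConfig {

    @Bean
    public Function<Order, Result> process(OrderRepository repository) {
        return order -> {
            repository.save(order);     // persist via Spring Data JPA / Hibernate mid-stream
            return Result.from(order);  // hypothetical factory; the return value goes to the output topic
        };
    }
}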

Source https://stackoverflow.com/questions/70358641

QUESTION

Filtering in Kafka and other streaming technologies

Asked 2021-Dec-27 at 07:46

I am currently doing some research about which stream processing technology to use. So far I have looked at message queueing technologies and streaming frameworks. I am now leaning towards Apache Kafka or Google Pub/Sub.

The requirements I have:

  • Deliver, read and process messages/events in real time.
  • Persistence in the messages/events.
  • Ability to filter messages/events in real time without having to read the entire topic. For example: if I have a topic called 'details', I want to be able to filter out the messages/events of that topic where an attribute of an event equals a certain value.
  • Ability to see if the producer to a certain topic or queue is finished.
  • Ability to delete messages/events in a topic based on an attribute within an event equaling a certain value.
  • Ordering in messages/events.

My question is: what is the best framework/technology for these use cases? From what I have read so far, Kafka doesn't provide an out-of-the-box filtering approach for messages/events in topics, while Google Pub/Sub does have a filtering approach.

Any suggestions and experience would be welcome.

ANSWER

Answered 2021-Dec-27 at 07:46

As per the requirements you mentioned, Kafka seems a nice fit. Using Kafka Streams or KSQL you can perform filtering in real time; here is an example: https://kafka-tutorials.confluent.io/filter-a-stream-of-events/confluent.html
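
As an illustration of the Kafka Streams route, a filter like the one below keeps only the events matching an attribute; the topic names and the JSON check are assumptions for the sketch, not something from the original question:

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> details = builder.stream("details");
details
        .filter((key, value) -> value.contains("\"country\":\"GR\""))  // keep only matching events
        .to("details-filtered");  // downstream consumers read the filtered topic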

What you need is more than just integration and data transfer; you need something similar to what is known as an ETL tool. Here you can find more about ETL and ETL tools in GCP: https://cloud.google.com/learn/what-is-etl

Source https://stackoverflow.com/questions/70490345

QUESTION

Pexpect Multi-Threading Idle State

Asked 2021-Dec-10 at 14:52

We have ~15,000 nodes to log into and pull data from via Pexpect. To speed this up, I am doing multiprocessing - splitting the load equally between 12 cores. That works great, but this is still over 1000 nodes per core - processed one at a time.

The CPU utilization of each core as it does this processing is roughly 2%. And that sort of makes sense, as most of the time is spent just waiting to see the Pexpect expect value as the node streams output. To take advantage of this and speed things up further, I want to implement multi-threading within the multiprocessing on each core.

To avoid any issues with shared variables, I put all the data needed to log into a node in a dictionary (one key in the dictionary per node), and then slice the dictionary, with each thread receiving a unique slice. After the threads are done, I combine the dictionary slices back together.

However, I am still seeing one thread completely finish before moving to the next.

I am wondering what constitutes an idle state such that a core can be moved to work on another thread? Does the fact that it is always looking for the Pexpect expect value mean it is never idle?

Also, I use the same target function for each thread, and I am not sure whether that target function being the same for each thread (with the same variables local to that function) is influencing this.

My multi-threading code is below, for reference.

Thanks for any insight!

import threading
from collections import ChainMap  # needed below to merge the dict slices
import numpy as np                # needed below for np.array_split
# import <lots of other stuff>

class ThreadClass(threading.Thread):
    def __init__(self, outputs_dict_split):
        super(ThreadClass, self).__init__()
        self.outputs_dict_split = outputs_dict_split

    def run(self):
        # store the result back on the instance so it can be read after join()
        self.outputs_dict_split = get_output(self.outputs_dict_split)

def get_output(outputs_dict):
    ### PEXPECT STUFF TO LOGIN AND RUN COMMANDS ###
    ### WRITE DEVICE'S OUTPUTS TO DEVICE'S OUTPUTS_DICT RESULTS SUB-KEY ###
    return outputs_dict

def backbone(outputs_dict):
    filterbykey = lambda keys: {x: outputs_dict[x] for x in keys}
    num_threads = 2
    device_split = np.array_split(list(outputs_dict.keys()), num_threads)

    outputs_dict_split_list = []
    split_list1 = list(device_split[0])
    split_list2 = list(device_split[1])
    outputs_dict_split1 = filterbykey(split_list1)
    outputs_dict_split2 = filterbykey(split_list2)
    t1 = ThreadClass(outputs_dict_split1)
    t2 = ThreadClass(outputs_dict_split2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    outputs_dict_split1 = t1.outputs_dict_split
    outputs_dict_split2 = t2.outputs_dict_split
    outputs_dict_split_list.append(outputs_dict_split1)
    outputs_dict_split_list.append(outputs_dict_split2)
    outputs_dict = ChainMap(*outputs_dict_split_list)

    ### Downstream Processing ###

ANSWER

Answered 2021-Dec-10 at 14:52

This actually worked. However, I had to scale the number of devices being processed in order to see substantial improvements in overall processing time.

Source https://stackoverflow.com/questions/70218408

QUESTION

Apache Flink - how to stop and resume stream processing on downstream failure

Asked 2021-Nov-22 at 04:53

I have a Flink application that consumes incoming messages on a Kafka topic with multiple partitions, does some processing, then sends them to a sink that sends them over HTTP to an external service. Sometimes the downstream service is down, and the stream processing needs to stop until it is back in action.

There are two approaches I am considering.

  1. Throw an exception when the Http sink fails to send the output message. This will cause the task and job to restart according to the configured restart strategy. Eventually the downstream service will be back and the system will continue where it left off.
  2. Have the Sink sleep and retry on failure; it can do this continually until the downstream service is back.

From what I understand and from my PoC, with 1. I will lose exactly-once guarantees, since the sink itself is external state. As far as I can see, you cannot make a simple HTTP endpoint transactional, as it would need to be to implement TwoPhaseCommitSinkFunction.

With 2. this is less of an issue, since the pipeline will not proceed until the sink makes a successful write, and I can rely on back pressure throughout the system to pause the retrieval of messages from the Kafka source.

The main questions I have are:

  1. Is it a correct assumption that you can't make a TwoPhaseCommitSinkFunction for a simple HTTP endpoint?
  2. Which of the two strategies, or neither, makes the most sense?
  3. Am I missing simpler obvious solutions?

ANSWER

Answered 2021-Nov-22 at 04:53

I think you can try AsyncIO in Flink - https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/asyncio/.

Try to make the HTTP endpoint send a response only once all operations for the request have completed, e.g. the HTTP server has finished processing the request and the result has been committed to the DB. Then use an async HTTP client in the AsyncIO operator. The AsyncIO operator will wait until the response is received. If any error happens, the Flink streaming pipeline will fail and restart based on the configured recovery strategy.

All requests to the HTTP endpoint that have not yet received a response are held in the internal buffer of the AsyncIO operator; if the streaming pipeline fails, the requests pending in the buffer are saved in the checkpoint state. The operator will also trigger back pressure when the internal buffer is full.
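
A minimal sketch of the wiring, assuming processed is the stream that previously fed the HTTP sink and HttpAsyncFunction is a hypothetical user-defined RichAsyncFunction that issues the request with a non-blocking HTTP client:

import java.util.concurrent.TimeUnit;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;

DataStream<Response> responses =
        AsyncDataStream.orderedWait(
                processed,                // the stream to deliver over HTTP
                new HttpAsyncFunction(),  // hypothetical RichAsyncFunction<Request, Response>
                30, TimeUnit.SECONDS,     // timeout per in-flight request
                100);                     // capacity of the operator's internal buffer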

Source https://stackoverflow.com/questions/70052227

QUESTION

Hazelcast IMDG vs Hazelcast Jet config

Asked 2021-Oct-12 at 22:08

How do I read Hazelcast IMDG data in Hazelcast Jet?

In my case I require both Hazelcast IMDG (a distributed cache) to store data for the future and Jet to perform batch and stream processing.

So I will be saving data using Hazelcast IMDG (MapStore) and filtering using Hazelcast Jet.

public class Test {

    HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    JetInstance jet = Jet.newJetInstance();

    public static void main(String[] args) {
        Test t = new Test();
        t.loadIntoIMap();
        t.readFromIMap();
    }

    public void loadIntoIMap() {
        IMap<String, String> map = hz.getMap("my-distributed-map");
        // Standard Put and Get
        map.put("1", "John");
        map.put("2", "Mary");
        map.put("3", "Jane");
    }

    public void readFromIMap() {
        System.err.println("--manu---");
        jet.getMap("s").put("1", "2");
        System.err.println(jet.getMap("s").size());
        System.err.println(jet.getMap("my-distributed-map").size());
    }

}

Do we need separate configurations for both (Jet and IMDG), or can I share Hazelcast IMap data inside Jet with a single config?

I'm a little confused about the split between Jet and Hazelcast IMDG.

ANSWER

Answered 2021-Oct-12 at 22:08

The answer differs depending on the version you want to use.

IMDG up to 4.2 and Jet 4.5

Hazelcast Jet is built on top of Hazelcast IMDG. When you start a Jet instance there is automatically an IMDG instance running. There is the JetInstance#getHazelcastInstance method to retrieve the IMDG instance from a Jet instance, and JetConfig#setHazelcastConfig to configure IMDG-specific settings.

You can access the maps from your cluster in Jet using com.hazelcast.jet.pipeline.Sources#map(String)
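
For example, a Jet 4.x pipeline reading the question's IMap might look like this sketch (the filter step is illustrative, not from the original post):

import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;

JetInstance jet = Jet.newJetInstance();
jet.getMap("my-distributed-map").put("1", "John");

Pipeline p = Pipeline.create();
p.readFrom(Sources.<String, String>map("my-distributed-map"))
 .filter(entry -> entry.getValue().startsWith("J"))  // illustrative filter
 .writeTo(Sinks.logger());
jet.newJob(p).join();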

You should not start both IMDG and Jet separately on the same machine. However, you can create 2 clusters, one IMDG and one Jet, and connect from Jet using com.hazelcast.jet.pipeline.Sources#remoteMap(String, ClientConfig) and similar for other data structures.

If you are already using Hazelcast it's likely this version.

Hazelcast 5.0

With the recent 5.0 release these two products were merged together. There is a single artefact to use: com.hazelcast:hazelcast. You just create a Hazelcast instance and, if enabled, you can get the Jet engine from it using HazelcastInstance#getJet.

5.0 is 100% compatible with IMDG 4.2 (just change the dependency) and mostly compatible with Jet 4.5, though some code changes are needed.
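
Under 5.0 the setup collapses to a single instance; a minimal sketch, assuming the merged com.hazelcast:hazelcast artifact:

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.jet.JetService;

Config config = new Config();
config.getJetConfig().setEnabled(true);  // the Jet engine is disabled by default
HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
JetService jet = hz.getJet();            // same instance serves IMDG maps and Jet jobs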

Source https://stackoverflow.com/questions/69537832

QUESTION

What is benefit of using Kafka streams?

Asked 2021-Oct-07 at 15:28

I am trying to understand the benefit of using Kafka Streams in my business model. A customer publishes an order and instantly gets offers from sellers who are online and interested in that order.

In this case streams are a good fit for joining the available (online) sellers to the order stream and for filtering and sorting the offers by price. So as a result the customer should get the best offers, sorted by price, on request.

I have discovered only one benefit: fewer server calls (all calculations happen in the stream).

My question is: why do streams matter in this case, when I could implement these business steps using the standard approach with one monolithic application?

I know this question is opinion based, but after reading some books about stream processing it is still hard to change my mind on this approach.

ANSWER

Answered 2021-Oct-07 at 15:28

only one benefit: fewer server calls

Kafka Streams can still do "server calls", especially when using Interactive Queries with an RPC layer. Fetching data from a remote table, such as KSQLdb, is also a "server call".

This is not the only benefit. Have you tried to write a join between topics using the regular consumer API? Or a filter/map in less than 2 lines of code (outside the config setup)?
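
For a sense of that brevity, a filter plus a map over a topic is roughly this much Kafka Streams code (the topic names are made up for the sketch):

import org.apache.kafka.streams.StreamsBuilder;

StreamsBuilder builder = new StreamsBuilder();
builder.<String, String>stream("offers")
       .filter((orderId, offer) -> offer != null)  // drop incomplete offers
       .mapValues(offer -> offer.toUpperCase())    // arbitrary per-record transform
       .to("best-offers");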

could implement these business steps using the standard approach with one monolithic application?

A Streams topology can still be embedded within a monolith, so I don't understand your point here. I assume you mean a fully synchronous application with a traditional database + API layer?


The books you say you've read should go over most benefits of stream processing, but you might want to check out "Kafka Streams in Action" for the specific advantages of Kafka Streams.

Source https://stackoverflow.com/questions/69480561

QUESTION

How to filter data in Kafka?

Asked 2021-Sep-29 at 09:50

My understanding is that I can filter data using streams and route it to specific topics.

Problem: the producer sends data with a country field. Stream processing then filters that data and writes it to topics by country code.

As a result, consumers who are subscribed to a specific country code get the relevant messages.

The problem is that this requires as many topics as there are countries, and in the future I will need to do the same with even more countries.

How should I organize this in Kafka and filter the data?

ANSWER

Answered 2021-Sep-29 at 09:50

You have a few options here:

Kafka Streams: with Kafka Streams you can filter the data as per your needs and write it to new topics. Consumers can then consume messages from those new topics.

Filter on the consumer side: you consume the data and filter it against the required criteria on the consumer side.

Use separate partitions per country code: you define the topic's partition count according to the number of country codes and use the country code as the message key. Then point each consumer at the right partition for consuming country-specific messages (see the sketch below).
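
A rough sketch of that partition-based option; producer, consumer, countryCode, eventJson, and partitionForCountry are placeholders, and note that with the default hash partitioner you would need to manage the key-to-partition mapping (for example with a custom partitioner) to guarantee one country per partition:

import java.util.List;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

// producer side: use the country code as the record key
producer.send(new ProducerRecord<>("details", countryCode, eventJson));

// consumer side: read only the partition that holds that country's messages
consumer.assign(List.of(new TopicPartition("details", partitionForCountry)));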

Source https://stackoverflow.com/questions/69373819

Community Discussions contain sources that include Stack Exchange Network

