snappy | Snappy compression format in the Go programming language

by golang | Go | Version: v0.0.4 | License: BSD-3-Clause

kandi X-RAY | snappy Summary

snappy is a Go library. snappy has no bugs, it has a permissive license, and it has medium support. However, snappy has 1 reported vulnerability. You can download it from GitHub.
snappy implements the Snappy compression format in the Go programming language. To download and install from source: $ go get github.com/golang/snappy. Unless otherwise noted, the Snappy-Go source files are distributed under the BSD-style license found in the LICENSE file.
Support | Quality | Security | License | Reuse

kandi-support Support

snappy has a medium active ecosystem.
It has 1347 stars and 165 forks. There are 45 watchers for this library.
It had no major release in the last 6 months.
There are 9 open issues and 27 closed issues. On average, issues are closed in 97 days. There are 5 open pull requests and 0 closed requests.
It has a neutral sentiment in the developer community.
The latest version of snappy is v0.0.4.

kandi-Quality Quality

snappy has 0 bugs and 0 code smells.

kandi-Security Security

snappy has 1 reported vulnerability (0 critical, 1 high, 0 medium, 0 low).
snappy code analysis shows 0 unresolved vulnerabilities.
There are 0 security hotspots that need review.

kandi-License License

snappy is licensed under the BSD-3-Clause License. This license is Permissive.
Permissive licenses have the fewest restrictions, and you can use them in most projects.

kandi-Reuse Reuse

snappy releases are not available. You will need to build from source code and install.

                                                                                  snappy Key Features

                                                                                  The Snappy compression format in the Go programming language.

                                                                                  snappy Examples and Code Snippets

                                                                                  No Code Snippets are available at this moment for snappy.
                                                                                  Community Discussions

                                                                                  Trending Discussions on snappy

Dynamic stage path in snowflake
Jetpack Compose LazyRow scroll with snap only to start of next or previous element
Spring Boot Logging to a File
The Kafka topic is here, a Java consumer program finds it, but lists none of its content, while a kafka-console-consumer is able to
Error when running Pytest with DeltaTables
How can I have nice file names & efficient storage usage in my Foundry Magritte dataset export?
Upserts on Delta simply duplicates data?
OWL API NoSuchMethodError in saveOntology() call
pyarrow reading parquet from S3 performance confusions
Dask ParserError: Error tokenizing data when reading CSV

                                                                                  QUESTION

                                                                                  Dynamic stage path in snowflake
                                                                                  Asked 2022-Mar-14 at 10:31

                                                                                  I have a stage path as below

copy into table1 from (
select $1:InvestorID::varchar as Investor_ID from @company_stage/pbook/2022-03-10/Invor/part-00000-33cbc68b-69c1-40c0-943c-f586dfab3f49-c000.snappy.parquet
)
                                                                                  

                                                                                  This is my S3 location company_stage/pbook/2022-03-10/Invor,

                                                                                  I need to make this dynamic:

I) I need to change the "2022-03-10" folder to the current date.

II) It must take all parquet files in the folder automatically, without me specifying a filename. How can I achieve this?

                                                                                  ANSWER

                                                                                  Answered 2022-Mar-14 at 10:31

                                                                                  Here is one approach. Your stage shouldn't include the date as part of the stage name because if it did, you would need a new stage every day. Better to define the stage as company_stage/pbook/.

                                                                                  To make it dynamic, I suggest using the pattern option together with the COPY INTO command. You could create a variable with the regex pattern expression using current_date(), something like this:

set mypattern = '.*'||to_char(current_date(), 'YYYY-MM-DD')||'.*';
                                                                                  

                                                                                  Then use this variable in your COPY INTO command like this:

copy into table1 from (
select $1:InvestorID::varchar as Investor_ID from @company_stage/pbook/
) pattern = $mypattern
                                                                                  

                                                                                  Of course you can adjust your pattern matching as you see fit.

                                                                                  Source https://stackoverflow.com/questions/71453827

                                                                                  QUESTION

                                                                                  Jetpack Compose LazyRow scroll with snap only to start of next or previous element
                                                                                  Asked 2022-Mar-10 at 18:17

Is there a way to horizontally scroll only to the start (or a specified position) of the previous or next element with Jetpack Compose?

                                                                                  Snappy scrolling in RecyclerView

                                                                                  ANSWER

                                                                                  Answered 2021-Aug-22 at 19:08

                                                                                  You can check the scrolling direction like so

                                                                                  @Composable
                                                                                  private fun LazyListState.isScrollingUp(): Boolean {
                                                                                      var previousIndex by remember(this) { mutableStateOf(firstVisibleItemIndex) }
                                                                                      var previousScrollOffset by remember(this) { mutableStateOf(firstVisibleItemScrollOffset) }
                                                                                      return remember(this) {
                                                                                          derivedStateOf {
                                                                                              if (previousIndex != firstVisibleItemIndex) {
                                                                                                  previousIndex > firstVisibleItemIndex
                                                                                              } else {
                                                                                                  previousScrollOffset >= firstVisibleItemScrollOffset
                                                                                              }.also {
                                                                                                  previousIndex = firstVisibleItemIndex
                                                                                                  previousScrollOffset = firstVisibleItemScrollOffset
                                                                                              }
                                                                                          }
                                                                                      }.value
                                                                                  }
                                                                                  

Of course, you will need to create a rememberLazyListState() and then pass it to the list as a parameter.

Then, based on the scrolling direction, you can call lazyListState.scrollToItem(lazyListState.firstVisibleItemIndex + 1) in a coroutine (if the user is scrolling right), and the appropriate call for the other direction.

                                                                                  Source https://stackoverflow.com/questions/68882038

                                                                                  QUESTION

                                                                                  Spring Boot Logging to a File
                                                                                  Asked 2022-Feb-16 at 14:49

In my application config I have defined the following property:

                                                                                  logging.file.name  = application.logs
                                                                                  

When I run my application, it creates two files, application.logs.0 and application.logs.0.lck, and the content of the file is as follows:

                                                                                  
                                                                                  
                                                                                  
                                                                                  
<record>
  <date>2022-02-16T12:55:05.656986Z</date>
  <millis>1645016105656</millis>
  <nanos>986000</nanos>
  <sequence>0</sequence>
  <logger>org.apache.catalina.core.StandardService</logger>
  <level>INFO</level>
  <class>org.apache.catalina.core.StandardService</class>
  <method>startInternal</method>
  <thread>1</thread>
  <message>Starting service [Tomcat]</message>
</record>
<record>
  <date>2022-02-16T12:55:05.671696Z</date>
  <millis>1645016105671</millis>
  <nanos>696000</nanos>
  <sequence>1</sequence>
  <logger>org.apache.catalina.core.StandardEngine</logger>
  <level>INFO</level>
  <class>org.apache.catalina.core.StandardEngine</class>
  <method>startInternal</method>
  <thread>1</thread>
  <message>Starting Servlet engine: [Apache Tomcat/9.0.48]</message>
</record>
                                                                                  
                                                                                  

It's not printing the logs properly, and I don't want the output in XML format.

                                                                                  My Dependency Tree:

                                                                                  [INFO] com.walmart.uss:trigger:jar:0.0.1-SNAPSHOT
                                                                                  [INFO] +- com.google.cloud:google-cloud-logging:jar:3.0.0:compile
                                                                                  [INFO] |  +- com.google.guava:guava:jar:31.0.1-jre:compile
                                                                                  [INFO] |  +- com.google.guava:failureaccess:jar:1.0.1:compile
                                                                                  [INFO] |  +- com.google.guava:listenablefuture:jar:9999.0-empty-to-avoid-conflict-with-guava:compile
                                                                                  [INFO] |  +- com.google.code.findbugs:jsr305:jar:3.0.2:compile
                                                                                  [INFO] |  +- org.checkerframework:checker-qual:jar:3.8.0:compile
                                                                                  [INFO] |  +- com.google.errorprone:error_prone_annotations:jar:2.8.1:compile
                                                                                  [INFO] |  +- com.google.j2objc:j2objc-annotations:jar:1.3:compile
                                                                                  [INFO] |  +- io.grpc:grpc-api:jar:1.41.0:compile
                                                                                  [INFO] |  +- io.grpc:grpc-context:jar:1.41.0:compile
                                                                                  [INFO] |  +- io.grpc:grpc-stub:jar:1.41.0:compile
                                                                                  [INFO] |  +- io.grpc:grpc-protobuf:jar:1.41.0:compile
                                                                                  [INFO] |  +- io.grpc:grpc-protobuf-lite:jar:1.41.0:compile
                                                                                  [INFO] |  +- com.google.api:api-common:jar:2.0.5:compile
                                                                                  [INFO] |  +- javax.annotation:javax.annotation-api:jar:1.3.2:compile
                                                                                  [INFO] |  +- com.google.auto.value:auto-value-annotations:jar:1.8.2:compile
                                                                                  [INFO] |  +- com.google.protobuf:protobuf-java:jar:3.18.1:compile
                                                                                  [INFO] |  +- com.google.protobuf:protobuf-java-util:jar:3.18.1:compile
                                                                                  [INFO] |  +- com.google.code.gson:gson:jar:2.8.7:compile
                                                                                  [INFO] |  +- com.google.api.grpc:proto-google-common-protos:jar:2.6.0:compile
                                                                                  [INFO] |  +- com.google.api.grpc:proto-google-cloud-logging-v2:jar:0.92.0:compile
                                                                                  [INFO] |  +- com.google.api:gax:jar:2.6.1:compile
                                                                                  [INFO] |  +- io.opencensus:opencensus-api:jar:0.28.0:compile
                                                                                  [INFO] |  +- com.google.api:gax-grpc:jar:2.6.1:compile
                                                                                  [INFO] |  +- io.grpc:grpc-auth:jar:1.41.0:compile
                                                                                  [INFO] |  +- com.google.auth:google-auth-library-credentials:jar:1.2.1:compile
                                                                                  [INFO] |  +- io.grpc:grpc-netty-shaded:jar:1.41.0:compile
                                                                                  [INFO] |  +- io.grpc:grpc-alts:jar:1.41.0:compile
                                                                                  [INFO] |  +- io.grpc:grpc-grpclb:jar:1.41.0:compile
                                                                                  [INFO] |  +- org.conscrypt:conscrypt-openjdk-uber:jar:2.5.1:compile
                                                                                  [INFO] |  +- org.threeten:threetenbp:jar:1.5.1:compile
                                                                                  [INFO] |  +- com.google.cloud:google-cloud-core-grpc:jar:2.2.0:compile
                                                                                  [INFO] |  +- com.google.auth:google-auth-library-oauth2-http:jar:1.2.1:compile
                                                                                  [INFO] |  +- com.google.http-client:google-http-client-gson:jar:1.40.1:compile
                                                                                  [INFO] |  +- com.google.http-client:google-http-client:jar:1.40.1:compile
                                                                                  [INFO] |  +- commons-logging:commons-logging:jar:1.2:compile
                                                                                  [INFO] |  +- commons-codec:commons-codec:jar:1.15:compile
                                                                                  [INFO] |  +- org.apache.httpcomponents:httpcore:jar:4.4.14:compile
                                                                                  [INFO] |  +- io.opencensus:opencensus-contrib-http-util:jar:0.28.0:compile
                                                                                  [INFO] |  +- io.grpc:grpc-core:jar:1.41.0:compile
                                                                                  [INFO] |  +- com.google.android:annotations:jar:4.1.1.4:runtime
                                                                                  [INFO] |  +- org.codehaus.mojo:animal-sniffer-annotations:jar:1.20:runtime
                                                                                  [INFO] |  +- io.perfmark:perfmark-api:jar:0.23.0:runtime
                                                                                  [INFO] |  +- com.google.cloud:google-cloud-core:jar:2.2.0:compile
                                                                                  [INFO] |  \- com.google.api.grpc:proto-google-iam-v1:jar:1.1.6:compile
                                                                                  [INFO] +- org.springframework.boot:spring-boot-starter:jar:2.5.2:compile
                                                                                  [INFO] |  +- org.springframework.boot:spring-boot:jar:2.5.2:compile
                                                                                  [INFO] |  +- org.springframework.boot:spring-boot-autoconfigure:jar:2.5.2:compile
                                                                                  [INFO] |  +- jakarta.annotation:jakarta.annotation-api:jar:1.3.5:compile
                                                                                  [INFO] |  +- org.springframework:spring-core:jar:5.3.8:compile
                                                                                  [INFO] |  |  \- org.springframework:spring-jcl:jar:5.3.8:compile
                                                                                  [INFO] |  \- org.yaml:snakeyaml:jar:1.28:compile
                                                                                  [INFO] +- org.springframework.boot:spring-boot-starter-test:jar:2.5.2:test
                                                                                  [INFO] |  +- org.springframework.boot:spring-boot-test:jar:2.5.2:test
                                                                                  [INFO] |  +- org.springframework.boot:spring-boot-test-autoconfigure:jar:2.5.2:test
                                                                                  [INFO] |  +- com.jayway.jsonpath:json-path:jar:2.5.0:test
                                                                                  [INFO] |  |  \- net.minidev:json-smart:jar:2.4.7:compile
                                                                                  [INFO] |  |     \- net.minidev:accessors-smart:jar:2.4.7:compile
                                                                                  [INFO] |  +- jakarta.xml.bind:jakarta.xml.bind-api:jar:2.3.3:compile
                                                                                  [INFO] |  |  \- jakarta.activation:jakarta.activation-api:jar:1.2.2:compile
                                                                                  [INFO] |  +- org.assertj:assertj-core:jar:3.19.0:test
                                                                                  [INFO] |  +- org.hamcrest:hamcrest:jar:2.2:test
                                                                                  [INFO] |  +- org.junit.jupiter:junit-jupiter:jar:5.7.2:test
                                                                                  [INFO] |  |  +- org.junit.jupiter:junit-jupiter-api:jar:5.7.2:test
                                                                                  [INFO] |  |  |  +- org.apiguardian:apiguardian-api:jar:1.1.0:test
                                                                                  [INFO] |  |  |  +- org.opentest4j:opentest4j:jar:1.2.0:test
[INFO] |  |  |  \- org.junit.platform:junit-platform-commons:jar:1.7.2:test
[INFO] |  |  +- org.junit.jupiter:junit-jupiter-params:jar:5.7.2:test
[INFO] |  |  \- org.junit.jupiter:junit-jupiter-engine:jar:5.7.2:test
[INFO] |  |     \- org.junit.platform:junit-platform-engine:jar:1.7.2:test
[INFO] |  +- org.mockito:mockito-core:jar:3.9.0:test
[INFO] |  |  +- net.bytebuddy:byte-buddy:jar:1.10.22:compile
[INFO] |  |  +- net.bytebuddy:byte-buddy-agent:jar:1.10.22:test
[INFO] |  |  \- org.objenesis:objenesis:jar:3.2:compile
[INFO] |  +- org.mockito:mockito-junit-jupiter:jar:3.9.0:test
[INFO] |  +- org.skyscreamer:jsonassert:jar:1.5.0:test
[INFO] |  |  \- com.vaadin.external.google:android-json:jar:0.0.20131108.vaadin1:test
[INFO] |  +- org.springframework:spring-test:jar:5.3.8:test
[INFO] |  \- org.xmlunit:xmlunit-core:jar:2.8.2:test
[INFO] +- org.springframework.boot:spring-boot-starter-thymeleaf:jar:2.5.2:compile
[INFO] |  +- org.thymeleaf:thymeleaf-spring5:jar:3.0.12.RELEASE:compile
[INFO] |  |  \- org.thymeleaf:thymeleaf:jar:3.0.12.RELEASE:compile
[INFO] |  |     +- org.attoparser:attoparser:jar:2.0.5.RELEASE:compile
[INFO] |  |     \- org.unbescape:unbescape:jar:1.1.6.RELEASE:compile
[INFO] |  \- org.thymeleaf.extras:thymeleaf-extras-java8time:jar:3.0.4.RELEASE:compile
[INFO] +- org.springframework:spring-webmvc:jar:5.3.8:compile
[INFO] |  +- org.springframework:spring-aop:jar:5.3.8:compile
[INFO] |  +- org.springframework:spring-beans:jar:5.3.8:compile
[INFO] |  +- org.springframework:spring-context:jar:5.3.8:compile
[INFO] |  +- org.springframework:spring-expression:jar:5.3.8:compile
[INFO] |  \- org.springframework:spring-web:jar:5.3.8:compile
[INFO] +- org.springframework.boot:spring-boot-starter-security:jar:2.5.2:compile
[INFO] |  +- org.springframework.security:spring-security-config:jar:5.5.1:compile
[INFO] |  |  \- org.springframework.security:spring-security-core:jar:5.5.1:compile
[INFO] |  |     \- org.springframework.security:spring-security-crypto:jar:5.5.1:compile
[INFO] |  \- org.springframework.security:spring-security-web:jar:5.5.1:compile
[INFO] +- org.springframework.data:spring-data-jpa:jar:2.5.2:compile
[INFO] |  +- org.springframework.data:spring-data-commons:jar:2.5.2:compile
[INFO] |  +- org.springframework:spring-orm:jar:5.3.8:compile
[INFO] |  |  \- org.springframework:spring-jdbc:jar:5.3.8:compile
[INFO] |  +- org.springframework:spring-tx:jar:5.3.8:compile
[INFO] |  +- org.aspectj:aspectjrt:jar:1.9.6:compile
[INFO] |  \- org.slf4j:slf4j-api:jar:1.7.31:compile
[INFO] +- org.springframework.boot:spring-boot-starter-data-jpa:jar:2.5.2:compile
[INFO] |  +- org.springframework.boot:spring-boot-starter-aop:jar:2.5.2:compile
[INFO] |  |  \- org.aspectj:aspectjweaver:jar:1.9.6:compile
[INFO] |  +- org.springframework.boot:spring-boot-starter-jdbc:jar:2.5.2:compile
[INFO] |  |  \- com.zaxxer:HikariCP:jar:4.0.3:compile
[INFO] |  +- jakarta.transaction:jakarta.transaction-api:jar:1.3.3:compile
[INFO] |  +- jakarta.persistence:jakarta.persistence-api:jar:2.2.3:compile
[INFO] |  +- org.hibernate:hibernate-core:jar:5.4.32.Final:compile
[INFO] |  |  +- org.jboss.logging:jboss-logging:jar:3.4.2.Final:compile
[INFO] |  |  +- org.javassist:javassist:jar:3.27.0-GA:compile
[INFO] |  |  +- antlr:antlr:jar:2.7.7:compile
[INFO] |  |  +- org.jboss:jandex:jar:2.2.3.Final:compile
[INFO] |  |  +- com.fasterxml:classmate:jar:1.5.1:compile
[INFO] |  |  +- org.dom4j:dom4j:jar:2.1.3:compile
[INFO] |  |  +- org.hibernate.common:hibernate-commons-annotations:jar:5.1.2.Final:compile
[INFO] |  |  \- org.glassfish.jaxb:jaxb-runtime:jar:2.3.4:compile
[INFO] |  |     +- org.glassfish.jaxb:txw2:jar:2.3.4:compile
[INFO] |  |     +- com.sun.istack:istack-commons-runtime:jar:3.0.12:compile
[INFO] |  |     \- com.sun.activation:jakarta.activation:jar:1.2.2:runtime
[INFO] |  \- org.springframework:spring-aspects:jar:5.3.8:compile
[INFO] +- org.projectlombok:lombok:jar:1.18.12:provided
[INFO] +- com.h2database:h2:jar:1.4.190:runtime
[INFO] +- org.springframework.boot:spring-boot-starter-web:jar:2.5.2:compile
[INFO] |  +- org.springframework.boot:spring-boot-starter-json:jar:2.5.2:compile
[INFO] |  |  +- com.fasterxml.jackson.datatype:jackson-datatype-jdk8:jar:2.12.3:compile
[INFO] |  |  +- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.12.3:compile
[INFO] |  |  \- com.fasterxml.jackson.module:jackson-module-parameter-names:jar:2.12.3:compile
[INFO] |  \- org.springframework.boot:spring-boot-starter-tomcat:jar:2.5.2:compile
[INFO] |     +- org.apache.tomcat.embed:tomcat-embed-core:jar:9.0.48:compile
[INFO] |     +- org.apache.tomcat.embed:tomcat-embed-el:jar:9.0.48:compile
[INFO] |     \- org.apache.tomcat.embed:tomcat-embed-websocket:jar:9.0.48:compile
[INFO] +- org.apache.httpcomponents:httpclient:jar:4.5.12:compile
[INFO] +- org.springframework.integration:spring-integration-core:jar:5.5.3:compile
[INFO] |  +- org.springframework:spring-messaging:jar:5.3.8:compile
[INFO] |  +- org.springframework.retry:spring-retry:jar:1.3.1:compile
[INFO] |  \- io.projectreactor:reactor-core:jar:3.4.7:compile
[INFO] |     \- org.reactivestreams:reactive-streams:jar:1.0.3:compile
[INFO] +- org.apache.commons:commons-text:jar:1.9:compile
[INFO] |  \- org.apache.commons:commons-lang3:jar:3.12.0:compile
[INFO] +- com.fasterxml.jackson.core:jackson-annotations:jar:2.12.3:compile
[INFO] +- com.fasterxml.jackson.core:jackson-core:jar:2.12.3:compile
[INFO] +- com.fasterxml.jackson.core:jackson-databind:jar:2.12.3:compile
[INFO] +- org.springframework.boot:spring-boot-starter-actuator:jar:2.5.2:compile
[INFO] |  +- org.springframework.boot:spring-boot-actuator-autoconfigure:jar:2.5.2:compile
[INFO] |  |  \- org.springframework.boot:spring-boot-actuator:jar:2.5.2:compile
[INFO] |  \- io.micrometer:micrometer-core:jar:1.7.1:compile
[INFO] |     +- org.hdrhistogram:HdrHistogram:jar:2.1.12:compile
[INFO] |     \- org.latencyutils:LatencyUtils:jar:2.0.3:runtime
[INFO] +- org.apache.maven.plugins:maven-compiler-plugin:jar:3.8.1:compile
[INFO] |  +- org.apache.maven:maven-plugin-api:jar:3.0:compile
[INFO] |  |  +- org.apache.maven:maven-model:jar:3.0:compile
[INFO] |  |  \- org.sonatype.sisu:sisu-inject-plexus:jar:1.4.2:compile
[INFO] |  |     \- org.sonatype.sisu:sisu-inject-bean:jar:1.4.2:compile
[INFO] |  |        \- org.sonatype.sisu:sisu-guice:jar:noaop:2.1.7:compile
[INFO] |  +- org.apache.maven:maven-artifact:jar:3.0:compile
[INFO] |  |  \- org.codehaus.plexus:plexus-utils:jar:2.0.4:compile
[INFO] |  +- org.apache.maven:maven-core:jar:3.0:compile
[INFO] |  |  +- org.apache.maven:maven-settings:jar:3.0:compile
[INFO] |  |  +- org.apache.maven:maven-settings-builder:jar:3.0:compile
[INFO] |  |  +- org.apache.maven:maven-repository-metadata:jar:3.0:compile
[INFO] |  |  +- org.apache.maven:maven-model-builder:jar:3.0:compile
[INFO] |  |  +- org.apache.maven:maven-aether-provider:jar:3.0:runtime
[INFO] |  |  +- org.sonatype.aether:aether-impl:jar:1.7:compile
[INFO] |  |  |  \- org.sonatype.aether:aether-spi:jar:1.7:compile
[INFO] |  |  +- org.sonatype.aether:aether-api:jar:1.7:compile
[INFO] |  |  +- org.sonatype.aether:aether-util:jar:1.7:compile
[INFO] |  |  +- org.codehaus.plexus:plexus-interpolation:jar:1.14:compile
[INFO] |  |  +- org.codehaus.plexus:plexus-classworlds:jar:2.2.3:compile
[INFO] |  |  +- org.codehaus.plexus:plexus-component-annotations:jar:1.5.5:compile
[INFO] |  |  \- org.sonatype.plexus:plexus-sec-dispatcher:jar:1.3:compile
[INFO] |  |     \- org.sonatype.plexus:plexus-cipher:jar:1.4:compile
[INFO] |  +- org.apache.maven.shared:maven-shared-utils:jar:3.2.1:compile
[INFO] |  |  \- commons-io:commons-io:jar:2.5:compile
[INFO] |  +- org.apache.maven.shared:maven-shared-incremental:jar:1.1:compile
[INFO] |  +- org.codehaus.plexus:plexus-java:jar:0.9.10:compile
[INFO] |  |  +- org.ow2.asm:asm:jar:6.2:compile
[INFO] |  |  \- com.thoughtworks.qdox:qdox:jar:2.0-M8:compile
[INFO] |  +- org.codehaus.plexus:plexus-compiler-api:jar:2.8.4:compile
[INFO] |  +- org.codehaus.plexus:plexus-compiler-manager:jar:2.8.4:compile
[INFO] |  \- org.codehaus.plexus:plexus-compiler-javac:jar:2.8.4:runtime
[INFO] +- org.postgresql:postgresql:jar:42.2.23:compile
[INFO] +- junit:junit:jar:4.12:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:2.2:test
[INFO] +- org.springframework.boot:spring-boot-loader:jar:2.5.6:compile
[INFO] +- com.google.cloud:google-cloud-dataproc:jar:2.2.2:compile
[INFO] |  \- com.google.api.grpc:proto-google-cloud-dataproc-v1:jar:2.2.2:compile
[INFO] +- mysql:mysql-connector-java:jar:8.0.25:compile
[INFO] +- com.google.cloud:google-cloud-bigquery:jar:2.3.3:compile
[INFO] |  +- com.google.cloud:google-cloud-core-http:jar:2.2.0:compile
[INFO] |  +- com.google.api-client:google-api-client:jar:1.32.2:compile
[INFO] |  +- com.google.oauth-client:google-oauth-client:jar:1.32.1:compile
[INFO] |  +- com.google.http-client:google-http-client-apache-v2:jar:1.40.1:compile
[INFO] |  +- com.google.http-client:google-http-client-appengine:jar:1.40.1:compile
[INFO] |  +- com.google.api:gax-httpjson:jar:0.91.1:compile
[INFO] |  +- com.google.http-client:google-http-client-jackson2:jar:1.40.1:compile
[INFO] |  +- org.checkerframework:checker-compat-qual:jar:2.5.5:compile
[INFO] |  \- com.google.apis:google-api-services-bigquery:jar:v2-rev20211017-1.32.1:compile
[INFO] +- org.apache.spark:spark-core_2.12:jar:3.1.0:compile
[INFO] |  +- com.thoughtworks.paranamer:paranamer:jar:2.8:compile
[INFO] |  +- org.apache.avro:avro:jar:1.8.2:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
[INFO] |  |  +- org.apache.commons:commons-compress:jar:1.8.1:compile
[INFO] |  |  \- org.tukaani:xz:jar:1.5:compile
[INFO] |  +- org.apache.avro:avro-mapred:jar:hadoop2:1.8.2:compile
[INFO] |  |  \- org.apache.avro:avro-ipc:jar:1.8.2:compile
[INFO] |  +- com.twitter:chill_2.12:jar:0.9.5:compile
[INFO] |  |  \- com.esotericsoftware:kryo-shaded:jar:4.0.2:compile
[INFO] |  |     \- com.esotericsoftware:minlog:jar:1.3.0:compile
[INFO] |  +- com.twitter:chill-java:jar:0.9.5:compile
[INFO] |  +- org.apache.xbean:xbean-asm7-shaded:jar:4.15:compile
[INFO] |  +- org.apache.hadoop:hadoop-client:jar:3.2.0:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-common:jar:3.2.0:compile
[INFO] |  |  |  +- commons-cli:commons-cli:jar:1.2:compile
[INFO] |  |  |  +- commons-collections:commons-collections:jar:3.2.2:compile
[INFO] |  |  |  +- org.eclipse.jetty:jetty-servlet:jar:9.4.42.v20210604:compile
[INFO] |  |  |  |  +- org.eclipse.jetty:jetty-security:jar:9.4.42.v20210604:compile
[INFO] |  |  |  |  \- org.eclipse.jetty:jetty-util-ajax:jar:9.4.42.v20210604:compile
[INFO] |  |  |  +- javax.servlet.jsp:jsp-api:jar:2.1:runtime
[INFO] |  |  |  +- commons-beanutils:commons-beanutils:jar:1.9.3:compile
[INFO] |  |  |  +- org.apache.commons:commons-configuration2:jar:2.1.1:compile
[INFO] |  |  |  +- com.google.re2j:re2j:jar:1.1:compile
[INFO] |  |  |  +- org.apache.hadoop:hadoop-auth:jar:3.2.0:compile
[INFO] |  |  |  |  \- com.nimbusds:nimbus-jose-jwt:jar:9.10:compile
[INFO] |  |  |  |     \- com.github.stephenc.jcip:jcip-annotations:jar:1.0-1:compile
[INFO] |  |  |  +- org.apache.curator:curator-client:jar:2.12.0:compile
[INFO] |  |  |  +- org.apache.htrace:htrace-core4:jar:4.1.0-incubating:compile
[INFO] |  |  |  +- org.apache.kerby:kerb-simplekdc:jar:1.0.1:compile
[INFO] |  |  |  |  +- org.apache.kerby:kerb-client:jar:1.0.1:compile
[INFO] |  |  |  |  |  +- org.apache.kerby:kerby-config:jar:1.0.1:compile
[INFO] |  |  |  |  |  +- org.apache.kerby:kerb-core:jar:1.0.1:compile
[INFO] |  |  |  |  |  |  \- org.apache.kerby:kerby-pkix:jar:1.0.1:compile
[INFO] |  |  |  |  |  |     +- org.apache.kerby:kerby-asn1:jar:1.0.1:compile
[INFO] |  |  |  |  |  |     \- org.apache.kerby:kerby-util:jar:1.0.1:compile
[INFO] |  |  |  |  |  +- org.apache.kerby:kerb-common:jar:1.0.1:compile
[INFO] |  |  |  |  |  |  \- org.apache.kerby:kerb-crypto:jar:1.0.1:compile
[INFO] |  |  |  |  |  +- org.apache.kerby:kerb-util:jar:1.0.1:compile
[INFO] |  |  |  |  |  \- org.apache.kerby:token-provider:jar:1.0.1:compile
[INFO] |  |  |  |  \- org.apache.kerby:kerb-admin:jar:1.0.1:compile
[INFO] |  |  |  |     +- org.apache.kerby:kerb-server:jar:1.0.1:compile
[INFO] |  |  |  |     |  \- org.apache.kerby:kerb-identity:jar:1.0.1:compile
[INFO] |  |  |  |     \- org.apache.kerby:kerby-xdr:jar:1.0.1:compile
[INFO] |  |  |  +- org.codehaus.woodstox:stax2-api:jar:3.1.4:compile
[INFO] |  |  |  +- com.fasterxml.woodstox:woodstox-core:jar:5.0.3:compile
[INFO] |  |  |  \- dnsjava:dnsjava:jar:2.1.7:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-hdfs-client:jar:3.2.0:compile
[INFO] |  |  |  \- com.squareup.okhttp:okhttp:jar:2.7.5:compile
[INFO] |  |  |     \- com.squareup.okio:okio:jar:1.6.0:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-api:jar:3.2.0:compile
[INFO] |  |  |  \- javax.xml.bind:jaxb-api:jar:2.3.1:compile
[INFO] |  |  |     \- javax.activation:javax.activation-api:jar:1.2.0:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-yarn-client:jar:3.2.0:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.2.0:compile
[INFO] |  |  |  \- org.apache.hadoop:hadoop-yarn-common:jar:3.2.0:compile
[INFO] |  |  |     +- javax.servlet:javax.servlet-api:jar:4.0.1:compile
[INFO] |  |  |     +- org.eclipse.jetty:jetty-util:jar:9.4.42.v20210604:compile
[INFO] |  |  |     +- com.fasterxml.jackson.module:jackson-module-jaxb-annotations:jar:2.12.3:compile
[INFO] |  |  |     \- com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider:jar:2.12.3:compile
[INFO] |  |  |        \- com.fasterxml.jackson.jaxrs:jackson-jaxrs-base:jar:2.12.3:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:3.2.0:compile
[INFO] |  |  |  \- org.apache.hadoop:hadoop-mapreduce-client-common:jar:3.2.0:compile
[INFO] |  |  \- org.apache.hadoop:hadoop-annotations:jar:3.2.0:compile
[INFO] |  +- org.apache.spark:spark-launcher_2.12:jar:3.1.0:compile
[INFO] |  +- org.apache.spark:spark-kvstore_2.12:jar:3.1.0:compile
[INFO] |  |  \- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile
[INFO] |  +- org.apache.spark:spark-network-common_2.12:jar:3.1.0:compile
[INFO] |  +- org.apache.spark:spark-network-shuffle_2.12:jar:3.1.0:compile
[INFO] |  +- org.apache.spark:spark-unsafe_2.12:jar:3.1.0:compile
[INFO] |  +- javax.activation:activation:jar:1.1.1:compile
[INFO] |  +- org.apache.curator:curator-recipes:jar:2.13.0:compile
[INFO] |  |  \- org.apache.curator:curator-framework:jar:2.13.0:compile
[INFO] |  +- org.apache.zookeeper:zookeeper:jar:3.4.14:compile
[INFO] |  |  \- org.apache.yetus:audience-annotations:jar:0.5.0:compile
[INFO] |  +- jakarta.servlet:jakarta.servlet-api:jar:4.0.4:compile
[INFO] |  +- org.apache.commons:commons-math3:jar:3.4.1:compile
[INFO] |  +- org.slf4j:jul-to-slf4j:jar:1.7.31:compile
[INFO] |  +- org.slf4j:jcl-over-slf4j:jar:1.7.31:compile
[INFO] |  +- com.ning:compress-lzf:jar:1.0.3:compile
[INFO] |  +- org.xerial.snappy:snappy-java:jar:1.1.8.2:compile
[INFO] |  +- org.lz4:lz4-java:jar:1.7.1:compile
[INFO] |  +- com.github.luben:zstd-jni:jar:1.4.8-1:compile
[INFO] |  +- org.roaringbitmap:RoaringBitmap:jar:0.9.0:compile
[INFO] |  |  \- org.roaringbitmap:shims:jar:0.9.0:runtime
[INFO] |  +- commons-net:commons-net:jar:3.1:compile
[INFO] |  +- org.scala-lang.modules:scala-xml_2.12:jar:1.2.0:compile
[INFO] |  +- org.scala-lang:scala-library:jar:2.12.10:compile
[INFO] |  +- org.scala-lang:scala-reflect:jar:2.12.10:compile
[INFO] |  +- org.json4s:json4s-jackson_2.12:jar:3.7.0-M5:compile
[INFO] |  |  \- org.json4s:json4s-core_2.12:jar:3.7.0-M5:compile
[INFO] |  |     +- org.json4s:json4s-ast_2.12:jar:3.7.0-M5:compile
[INFO] |  |     \- org.json4s:json4s-scalap_2.12:jar:3.7.0-M5:compile
[INFO] |  +- org.glassfish.jersey.core:jersey-client:jar:2.33:compile
[INFO] |  |  +- jakarta.ws.rs:jakarta.ws.rs-api:jar:2.1.6:compile
[INFO] |  |  \- org.glassfish.hk2.external:jakarta.inject:jar:2.6.1:compile
[INFO] |  +- org.glassfish.jersey.core:jersey-common:jar:2.33:compile
[INFO] |  |  \- org.glassfish.hk2:osgi-resource-locator:jar:1.0.3:compile
[INFO] |  +- org.glassfish.jersey.core:jersey-server:jar:2.33:compile
[INFO] |  |  \- jakarta.validation:jakarta.validation-api:jar:2.0.2:compile
[INFO] |  +- org.glassfish.jersey.containers:jersey-container-servlet:jar:2.33:compile
[INFO] |  +- org.glassfish.jersey.containers:jersey-container-servlet-core:jar:2.33:compile
[INFO] |  +- org.glassfish.jersey.inject:jersey-hk2:jar:2.33:compile
[INFO] |  |  \- org.glassfish.hk2:hk2-locator:jar:2.6.1:compile
                                                                                  [INFO] |  |     +- org.glassfish.hk2.external:aopalliance-repackaged:jar:2.6.1:compile
                                                                                  [INFO] |  |     +- org.glassfish.hk2:hk2-api:jar:2.6.1:compile
                                                                                  [INFO] |  |     \- org.glassfish.hk2:hk2-utils:jar:2.6.1:compile
                                                                                  [INFO] |  +- io.netty:netty-all:jar:4.1.65.Final:compile
                                                                                  [INFO] |  +- com.clearspring.analytics:stream:jar:2.9.6:compile
                                                                                  [INFO] |  +- io.dropwizard.metrics:metrics-core:jar:4.1.24:compile
                                                                                  [INFO] |  +- io.dropwizard.metrics:metrics-jvm:jar:4.1.24:compile
                                                                                  [INFO] |  +- io.dropwizard.metrics:metrics-json:jar:4.1.24:compile
                                                                                  [INFO] |  +- io.dropwizard.metrics:metrics-graphite:jar:4.1.24:compile
                                                                                  [INFO] |  +- io.dropwizard.metrics:metrics-jmx:jar:4.1.24:compile
                                                                                  [INFO] |  +- com.fasterxml.jackson.module:jackson-module-scala_2.12:jar:2.12.3:compile
                                                                                  [INFO] |  +- org.apache.ivy:ivy:jar:2.4.0:compile
                                                                                  [INFO] |  +- oro:oro:jar:2.0.8:compile
                                                                                  [INFO] |  +- net.razorvine:pyrolite:jar:4.30:compile
                                                                                  [INFO] |  +- net.sf.py4j:py4j:jar:0.10.9:compile
                                                                                  [INFO] |  +- org.apache.spark:spark-tags_2.12:jar:3.1.0:compile
                                                                                  [INFO] |  +- org.apache.commons:commons-crypto:jar:1.1.0:compile
                                                                                  [INFO] |  \- org.spark-project.spark:unused:jar:1.0.0:compile
                                                                                  [INFO] +- org.apache.spark:spark-streaming_2.12:jar:3.1.0:compile
                                                                                  [INFO] +- org.apache.spark:spark-streaming-kafka-0-10_2.12:jar:3.1.0:compile
                                                                                  [INFO] |  +- org.apache.spark:spark-token-provider-kafka-0-10_2.12:jar:3.1.0:compile
                                                                                  [INFO] |  \- org.apache.kafka:kafka-clients:jar:2.7.1:compile
                                                                                  [INFO] +- org.apache.spark:spark-avro_2.12:jar:3.1.0:compile
                                                                                  [INFO] +- org.apache.spark:spark-sql-kafka-0-10_2.12:jar:3.1.0:compile
                                                                                  [INFO] |  \- org.apache.commons:commons-pool2:jar:2.9.0:compile
                                                                                  [INFO] +- org.codehaus.janino:janino:jar:3.0.8:compile
                                                                                  [INFO] +- org.codehaus.janino:commons-compiler:jar:3.0.8:compile
                                                                                  [INFO] +- org.apache.spark:spark-sql_2.12:jar:3.1.0:compile
                                                                                  [INFO] |  +- com.univocity:univocity-parsers:jar:2.9.0:compile
                                                                                  [INFO] |  +- org.apache.spark:spark-sketch_2.12:jar:3.1.0:compile
                                                                                  [INFO] |  +- org.apache.spark:spark-catalyst_2.12:jar:3.1.0:compile
                                                                                  [INFO] |  |  +- org.scala-lang.modules:scala-parser-combinators_2.12:jar:1.1.2:compile
                                                                                  [INFO] |  |  +- org.antlr:antlr4-runtime:jar:4.8-1:compile
                                                                                  [INFO] |  |  +- org.apache.arrow:arrow-vector:jar:2.0.0:compile
                                                                                  [INFO] |  |  |  +- org.apache.arrow:arrow-format:jar:2.0.0:compile
                                                                                  [INFO] |  |  |  +- org.apache.arrow:arrow-memory-core:jar:2.0.0:compile
                                                                                  [INFO] |  |  |  \- com.google.flatbuffers:flatbuffers-java:jar:1.9.0:compile
                                                                                  [INFO] |  |  \- org.apache.arrow:arrow-memory-netty:jar:2.0.0:compile
                                                                                  [INFO] |  +- org.apache.orc:orc-core:jar:1.5.12:compile
                                                                                  [INFO] |  |  +- org.apache.orc:orc-shims:jar:1.5.12:compile
                                                                                  [INFO] |  |  +- commons-lang:commons-lang:jar:2.6:compile
                                                                                  [INFO] |  |  +- io.airlift:aircompressor:jar:0.10:compile
                                                                                  [INFO] |  |  \- org.threeten:threeten-extra:jar:1.5.0:compile
                                                                                  [INFO] |  +- org.apache.orc:orc-mapreduce:jar:1.5.12:compile
                                                                                  [INFO] |  +- org.apache.hive:hive-storage-api:jar:2.7.2:compile
                                                                                  [INFO] |  +- org.apache.parquet:parquet-column:jar:1.10.1:compile
                                                                                  [INFO] |  |  +- org.apache.parquet:parquet-common:jar:1.10.1:compile
                                                                                  [INFO] |  |  \- org.apache.parquet:parquet-encoding:jar:1.10.1:compile
                                                                                  [INFO] |  \- org.apache.parquet:parquet-hadoop:jar:1.10.1:compile
                                                                                  [INFO] |     +- org.apache.parquet:parquet-format:jar:2.4.0:compile
                                                                                  [INFO] |     \- org.apache.parquet:parquet-jackson:jar:1.10.1:compile
                                                                                  [INFO] +- org.springframework.kafka:spring-kafka:jar:2.8.2:compile
                                                                                  [INFO] +- com.google.cloud:google-cloud-storage:jar:2.1.9:compile
                                                                                  [INFO] |  \- com.google.apis:google-api-services-storage:jar:v1-rev20210918-1.32.1:compile
                                                                                  [INFO] \- za.co.absa:abris_2.12:jar:6.0.0:compile
                                                                                  [INFO]    +- io.confluent:kafka-avro-serializer:jar:6.2.1:compile
                                                                                  [INFO]    |  +- io.confluent:kafka-schema-serializer:jar:6.2.1:compile
                                                                                  [INFO]    |  \- io.confluent:common-utils:jar:6.2.1:compile
                                                                                  [INFO]    +- io.confluent:kafka-schema-registry-client:jar:6.2.1:compile
                                                                                  [INFO]    |  +- io.swagger:swagger-annotations:jar:1.6.2:compile
                                                                                  [INFO]    |  \- io.swagger:swagger-core:jar:1.6.2:compile
                                                                                  [INFO]    |     +- com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:jar:2.12.3:compile
                                                                                  [INFO]    |     \- io.swagger:swagger-models:jar:1.6.2:compile
                                                                                  [INFO]    \- za.co.absa.commons:commons_2.12:jar:1.0.0:compile
                                                                                  

                                                                                  My Spark integration with Spring Boot is causing the issue; I am not able to tell which dependency is causing it.

                                                                                  ANSWER

                                                                                  Answered 2022-Feb-16 at 13:12

                                                                                  According to this answer: https://stackoverflow.com/a/51236918/16651073 Tomcat falls back to default logging if it cannot resolve the location.

                                                                                  Can you try saving the property without the spaces?

                                                                                  Like this: logging.file.name=application.logs
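A sketch of what the answer suggests, assuming the original application.properties contained spaces around the separator (the value application.logs is the answer's example, not a value confirmed by the question):

```properties
# Problematic: spaces around '=' can leave the log-file location
# unresolved, so Tomcat falls back to its default logging
logging.file.name = application.logs

# Fixed: no spaces
logging.file.name=application.logs
```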

                                                                                  Source https://stackoverflow.com/questions/71142413

                                                                                  QUESTION

                                                                                  The Kafka topic is here, a Java consumer program finds it, but lists none of its content, while a kafka-console-consumer is able to
                                                                                  Asked 2022-Feb-16 at 13:23

                                                                                  It's my first Kafka program.

                                                                                  From a kafka_2.13-3.1.0 instance, I created a Kafka topic poids_garmin_brut and filled it with this csv:

                                                                                  kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic poids_garmin_brut
                                                                                  kafka-console-producer.sh --broker-list localhost:9092 --topic poids_garmin_brut < "Poids(1).csv"
                                                                                  
                                                                                  Durée,Poids,Variation,IMC,Masse grasse,Masse musculaire squelettique,Masse osseuse,Masse hydrique,
                                                                                  " 14 Fév. 2022",
                                                                                  06:37,72.1 kg,0.3 kg,22.8,26.3 %,29.7 kg,3.5 kg,53.8 %,
                                                                                  " 13 Fév. 2022",
                                                                                  06:48,72.4 kg,0.2 kg,22.9,25.4 %,29.8 kg,3.6 kg,54.4 %,
                                                                                  " 12 Fév. 2022",
                                                                                  06:17,72.2 kg,0.0 kg,22.8,25.3 %,29.7 kg,3.6 kg,54.5 %,
                                                                                  [...]
                                                                                  

                                                                                  And at any time, before or after running the program I'll show, its content can be displayed by a kafka-console-consumer command:

                                                                                  kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic poids_garmin_brut --from-beginning
                                                                                  Durée,Poids,Variation,IMC,Masse grasse,Masse musculaire squelettique,Masse osseuse,Masse hydrique,
                                                                                  " 14 Fév. 2022",
                                                                                  06:37,72.1 kg,0.3 kg,22.8,26.3 %,29.7 kg,3.5 kg,53.8 %,
                                                                                  " 13 Fév. 2022",
                                                                                  06:48,72.4 kg,0.2 kg,22.9,25.4 %,29.8 kg,3.6 kg,54.4 %,
                                                                                  " 12 Fév. 2022",
                                                                                  06:17,72.2 kg,0.0 kg,22.8,25.3 %,29.7 kg,3.6 kg,54.5 %,
                                                                                  " 11 Fév. 2022",
                                                                                  05:54,72.2 kg,0.1 kg,22.8,25.6 %,29.7 kg,3.5 kg,54.3 %,
                                                                                  " 10 Fév. 2022",
                                                                                  06:14,72.3 kg,0.0 kg,22.8,25.9 %,29.7 kg,3.5 kg,54.1 %,
                                                                                  " 9 Fév. 2022",
                                                                                  06:06,72.3 kg,0.5 kg,22.8,26.3 %,29.7 kg,3.5 kg,53.8 %,
                                                                                  " 8 Fév. 2022",
                                                                                  07:14,71.8 kg,0.7 kg,22.7,26.3 %,29.6 kg,3.5 kg,53.8 %,
                                                                                  

                                                                                  Here is the Java program, based on org.apache.kafka:kafka-streams:3.1.0 dependency, extracting this topic as a stream:

                                                                                  package extracteur.garmin;
                                                                                  
                                                                                  import org.apache.kafka.clients.consumer.ConsumerConfig;
                                                                                  import org.apache.kafka.common.serialization.Serdes;
                                                                                  import org.apache.kafka.streams.KafkaStreams;
                                                                                  import org.apache.kafka.streams.StreamsBuilder;
                                                                                  import org.apache.kafka.streams.StreamsConfig;
                                                                                  import org.apache.kafka.streams.kstream.KStream;
                                                                                  import org.slf4j.*;
                                                                                  
                                                                                  import org.springframework.boot.autoconfigure.SpringBootApplication;
                                                                                  
                                                                                  import java.util.Properties;
                                                                                  
                                                                                  @SpringBootApplication
                                                                                  public class Kafka {
                                                                                     /** Logger. */
                                                                                     private static final Logger LOGGER = LoggerFactory.getLogger(Kafka.class);
                                                                                  
                                                                                     public static void main(String[] args) {
                                                                                        LOGGER.info("L'extracteur de données Garmin démarre...");
                                                                                  
                                                                                        /* The input CSV file's data looks like this:
                                                                                  
                                                                                           Durée,Poids,Variation,IMC,Masse grasse,Masse musculaire squelettique,Masse osseuse,Masse hydrique,
                                                                                           " 14 Fév. 2022",
                                                                                           06:37,72.1 kg,0.3 kg,22.8,26.3 %,29.7 kg,3.5 kg,53.8 %,
                                                                                           " 13 Fév. 2022",
                                                                                           06:48,72.4 kg,0.2 kg,22.9,25.4 %,29.8 kg,3.6 kg,54.4 %,
                                                                                         */
                                                                                  
                                                                                        // Create a stream with no key, and string values.
                                                                                        StreamsBuilder builder = new StreamsBuilder();
                                                                                        KStream stream = builder.stream("poids_garmin_brut");
                                                                                  
                                                                                        // This is Kafka's foreach, not a Java lambda. It is lazy.
                                                                                        stream.foreach((key, value) -> {
                                                                                           LOGGER.info(value);
                                                                                        });
                                                                                  
                                                                                        KafkaStreams streams = new KafkaStreams(builder.build(), config());
                                                                                        streams.start();
                                                                                  
                                                                                        // Close the Kafka stream when the VM shuts down, by having it call
                                                                                        streams.close();
                                                                                        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
                                                                                     }
                                                                                  
                                                                                     /**
                                                                                       * Startup properties.
                                                                                       * @return configuration properties.
                                                                                      */
                                                                                     private static Properties config() {
                                                                                        Properties config = new Properties();
                                                                                        config.put(StreamsConfig.APPLICATION_ID_CONFIG, "dev1");
                                                                                        config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
                                                                                        config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
                                                                                        config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.Void().getClass());
                                                                                        config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
                                                                                        return config;
                                                                                     }
                                                                                  }
                                                                                  

                                                                                  But while the logs don't report any error during execution, my program never enters the stream.foreach callback, and therefore displays no content from that topic.
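One thing worth noting in the code above (an observation, not a confirmed diagnosis): streams.close() is called immediately after streams.start(), before the shutdown hook is registered, which would stop the topology before it polls any records. A minimal plain-Java sketch of that ordering issue, using a hypothetical FakeStreams stand-in rather than the real KafkaStreams API:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for KafkaStreams: once closed, it processes nothing.
class FakeStreams implements AutoCloseable {
    private final AtomicBoolean running = new AtomicBoolean(false);

    void start() { running.set(true); }

    // Returns true only while the "topology" is still running.
    boolean poll() { return running.get(); }

    @Override
    public void close() { running.set(false); }
}

public class ShutdownOrdering {
    public static void main(String[] args) {
        FakeStreams streams = new FakeStreams();
        streams.start();

        // As written in the question: an immediate close() stops processing
        // before any record can arrive...
        // streams.close();

        // ...whereas close() belongs only in the shutdown hook:
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));

        System.out.println(streams.poll() ? "processing" : "stopped");
    }
}
```

Running main prints "processing"; uncommenting the immediate streams.close() call makes it print "stopped", mirroring a consumer that never reaches its foreach.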

                                                                                  (In this log I removed the dev1-d1c8ce47-6fbf-41b7-b8aa-e3d094703088- part of [dev1-d1c8ce47-6fbf-41b7-b8aa-e3d094703088-StreamThread-1] that you would otherwise read inside, for SO message length and readability. And org.apache.kafka becomes o.a.k.)

                                                                                  /usr/lib/jvm/java-1.11.0-openjdk-amd64/bin/java -XX:TieredStopAtLevel=1 -noverify -Dspring.output.ansi.enabled=always -Dcom.sun.management.jmxremote -Dspring.jmx.enabled=true -Dspring.liveBeansView.mbeanDomain -Dspring.application.admin.enabled=true -javaagent:/opt/idea-IU-212.5284.40/lib/idea_rt.jar=41397:/opt/idea-IU-212.5284.40/bin -Dfile.encoding=UTF-8 -classpath /home/lebihan/dev/Java/garmin/target/classes:/home/lebihan/.m2/repository/org/slf4j/slf4j-api/1.7.33/slf4j-api-1.7.33.jar:/home/lebihan/.m2/repository/org/slf4j/log4j-over-slf4j/1.7.33/log4j-over-slf4j-1.7.33.jar:/home/lebihan/.m2/repository/ch/qos/logback/logback-classic/1.2.10/logback-classic-1.2.10.jar:/home/lebihan/.m2/repository/ch/qos/logback/logback-core/1.2.10/logback-core-1.2.10.jar:/home/lebihan/.m2/repository/org/springframework/boot/spring-boot-starter-web/2.6.3/spring-boot-starter-web-2.6.3.jar:/home/lebihan/.m2/repository/org/springframework/boot/spring-boot-starter/2.6.3/spring-boot-starter-2.6.3.jar:/home/lebihan/.m2/repository/org/springframework/boot/spring-boot/2.6.3/spring-boot-2.6.3.jar:/home/lebihan/.m2/repository/org/springframework/boot/spring-boot-autoconfigure/2.6.3/spring-boot-autoconfigure-2.6.3.jar:/home/lebihan/.m2/repository/org/springframework/boot/spring-boot-starter-logging/2.6.3/spring-boot-starter-logging-2.6.3.jar:/home/lebihan/.m2/repository/org/apache/logging/log4j/log4j-to-slf4j/2.17.1/log4j-to-slf4j-2.17.1.jar:/home/lebihan/.m2/repository/org/apache/logging/log4j/log4j-api/2.17.1/log4j-api-2.17.1.jar:/home/lebihan/.m2/repository/org/slf4j/jul-to-slf4j/1.7.33/jul-to-slf4j-1.7.33.jar:/home/lebihan/.m2/repository/jakarta/annotation/jakarta.annotation-api/1.3.5/jakarta.annotation-api-1.3.5.jar:/home/lebihan/.m2/repository/org/yaml/snakeyaml/1.29/snakeyaml-1.29.jar:/home/lebihan/.m2/repository/org/springframework/boot/spring-boot-starter-json/2.6.3/spring-boot-starter-json-2.6.3.jar:/hom
e/lebihan/.m2/repository/com/fasterxml/jackson/datatype/jackson-datatype-jdk8/2.13.1/jackson-datatype-jdk8-2.13.1.jar:/home/lebihan/.m2/repository/com/fasterxml/jackson/datatype/jackson-datatype-jsr310/2.13.1/jackson-datatype-jsr310-2.13.1.jar:/home/lebihan/.m2/repository/com/fasterxml/jackson/module/jackson-module-parameter-names/2.13.1/jackson-module-parameter-names-2.13.1.jar:/home/lebihan/.m2/repository/org/springframework/boot/spring-boot-starter-tomcat/2.6.3/spring-boot-starter-tomcat-2.6.3.jar:/home/lebihan/.m2/repository/org/apache/tomcat/embed/tomcat-embed-core/9.0.56/tomcat-embed-core-9.0.56.jar:/home/lebihan/.m2/repository/org/apache/tomcat/embed/tomcat-embed-el/9.0.56/tomcat-embed-el-9.0.56.jar:/home/lebihan/.m2/repository/org/apache/tomcat/embed/tomcat-embed-websocket/9.0.56/tomcat-embed-websocket-9.0.56.jar:/home/lebihan/.m2/repository/org/springframework/spring-web/5.3.15/spring-web-5.3.15.jar:/home/lebihan/.m2/repository/org/springframework/spring-beans/5.3.15/spring-beans-5.3.15.jar:/home/lebihan/.m2/repository/org/springframework/spring-webmvc/5.3.15/spring-webmvc-5.3.15.jar:/home/lebihan/.m2/repository/org/springframework/spring-aop/5.3.15/spring-aop-5.3.15.jar:/home/lebihan/.m2/repository/org/springframework/spring-context/5.3.15/spring-context-5.3.15.jar:/home/lebihan/.m2/repository/org/springframework/spring-expression/5.3.15/spring-expression-5.3.15.jar:/home/lebihan/.m2/repository/org/springframework/spring-core/5.3.15/spring-core-5.3.15.jar:/home/lebihan/.m2/repository/org/springframework/spring-jcl/5.3.15/spring-jcl-5.3.15.jar:/home/lebihan/.m2/repository/org/apache/kafka/kafka-streams/3.1.0/kafka-streams-3.1.0.jar:/home/lebihan/.m2/repository/org/apache/kafka/kafka-clients/3.0.0/kafka-clients-3.0.0.jar:/home/lebihan/.m2/repository/com/github/luben/zstd-jni/1.5.0-2/zstd-jni-1.5.0-2.jar:/home/lebihan/.m2/repository/org/lz4/lz4-java/1.7.1/lz4-java-1.7.1.jar:/home/lebihan/.m2/repository/org/xerial/snappy/snappy-java/1.1.8.1/snappy-java-1.1.8.1
.jar:/home/lebihan/.m2/repository/org/rocksdb/rocksdbjni/6.22.1.1/rocksdbjni-6.22.1.1.jar:/home/lebihan/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.13.1/jackson-annotations-2.13.1.jar:/home/lebihan/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.13.1/jackson-databind-2.13.1.jar:/home/lebihan/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.13.1/jackson-core-2.13.1.jar extracteur.garmin.Kafka
                                                                                  07:57:49.720 [main] INFO extracteur.garmin.Kafka - L'extracteur de données Garmin démarre...
07:57:49.747 [main] INFO o.a.k.streams.StreamsConfig - StreamsConfig values:
    acceptable.recovery.lag = 10000
    application.id = dev1
    application.server =
    bootstrap.servers = [localhost:9092]
    buffered.records.per.partition = 1000
    built.in.metrics.version = latest
    cache.max.bytes.buffering = 10485760
    client.id =
    commit.interval.ms = 30000
    connections.max.idle.ms = 540000
    default.deserialization.exception.handler = class o.a.k.streams.errors.LogAndFailExceptionHandler
    default.key.serde = class o.a.k.common.serialization.Serdes$VoidSerde
    default.list.key.serde.inner = null
    default.list.key.serde.type = null
    default.list.value.serde.inner = null
    default.list.value.serde.type = null
    default.production.exception.handler = class o.a.k.streams.errors.DefaultProductionExceptionHandler
    default.timestamp.extractor = class o.a.k.streams.processor.FailOnInvalidTimestamp
    default.value.serde = class o.a.k.common.serialization.Serdes$StringSerde
    max.task.idle.ms = 0
    max.warmup.replicas = 2
    metadata.max.age.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    num.standby.replicas = 0
    num.stream.threads = 1
    poll.ms = 100
    probing.rebalance.interval.ms = 600000
    processing.guarantee = at_least_once
    receive.buffer.bytes = 32768
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    replication.factor = -1
    request.timeout.ms = 40000
    retries = 0
    retry.backoff.ms = 100
    rocksdb.config.setter = null
    security.protocol = PLAINTEXT
    send.buffer.bytes = 131072
    state.cleanup.delay.ms = 600000
    state.dir = /tmp/kafka-streams
    task.timeout.ms = 300000
    topology.optimization = none
    upgrade.from = null
    window.size.ms = null
    windowed.inner.class.serde = null
    windowstore.changelog.additional.retention.ms = 86400000

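The StreamsConfig dump above is simply the echo of a java.util.Properties object handed to the KafkaStreams constructor; any key not set explicitly falls back to the logged default. A minimal sketch of how the non-default values above (`application.id = dev1`, `localhost:9092`, the String value serde) are typically supplied — the fully qualified serde class name assumes the log's abbreviated `o.a.k.` stands for `org.apache.kafka.`:

```java
import java.util.Properties;

public class StreamsProps {
    // Builds the Properties that would produce the StreamsConfig dump above.
    public static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "dev1");               // also used as the consumer group.id
        props.put("bootstrap.servers", "localhost:9092");
        props.put("default.value.serde",
                "org.apache.kafka.common.serialization.Serdes$StringSerde");
        props.put("processing.guarantee", "at_least_once"); // the logged default, shown explicitly
        props.put("num.stream.threads", "1");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("application.id")); // prints "dev1"
    }
}
```

Everything left unset here (commit.interval.ms = 30000, state.dir = /tmp/kafka-streams, and so on) is filled in by Kafka Streams itself, which is why the dump is much longer than the code that produced it.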
07:57:49.760 [main] INFO o.a.k.clients.admin.AdminClientConfig - AdminClientConfig values:
    bootstrap.servers = [localhost:9092]
    client.dns.lookup = use_all_dns_ips
    client.id = admin
    connections.max.idle.ms = 300000
    default.api.timeout.ms = 60000
    metadata.max.age.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    receive.buffer.bytes = 65536
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 30000
    retries = 2147483647
    retry.backoff.ms = 100
    sasl.client.callback.handler.class = null
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 60000
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.login.callback.handler.class = null
    sasl.login.class = null
    sasl.login.refresh.buffer.seconds = 300
    sasl.login.refresh.min.period.seconds = 60
    sasl.login.refresh.window.factor = 0.8
    sasl.login.refresh.window.jitter = 0.05
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    security.providers = null
    send.buffer.bytes = 131072
    socket.connection.setup.timeout.max.ms = 30000
    socket.connection.setup.timeout.ms = 10000
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
    ssl.endpoint.identification.algorithm = https
    ssl.engine.factory.class = null
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.certificate.chain = null
    ssl.keystore.key = null
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLSv1.3
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.certificates = null
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS

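This AdminClient is created by Kafka Streams itself, not by application code. Its settings can still be overridden from the same Properties object by prefixing an admin-client key with `admin.` (Kafka Streams strips the prefix and forwards the rest). A short sketch; the particular key being overridden is illustrative:

```java
import java.util.Properties;

public class AdminOverride {
    // Demonstrates the "admin." prefix convention Kafka Streams uses to
    // route a setting to its internal AdminClient.
    public static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "dev1");
        props.put("bootstrap.servers", "localhost:9092");
        // Streams removes the "admin." prefix and passes
        // request.timeout.ms = 30000 to the AdminClient shown in the log.
        props.put("admin.request.timeout.ms", "30000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("admin.request.timeout.ms")); // prints "30000"
    }
}
```

The same convention exists for the other embedded clients (`consumer.`, `producer.`, and the more specific `main.consumer.`, `restore.consumer.`, `global.consumer.` prefixes).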
07:57:49.790 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka version: 3.0.0
07:57:49.790 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka commitId: 8cb0a5e9d3441962
07:57:49.790 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka startTimeMs: 1644908269788
07:57:49.793 [main] INFO o.a.k.streams.KafkaStreams - stream-client [dev1-d1c8ce47-6fbf-41b7-b8aa-e3d094703088] Kafka Streams version: 3.1.0
07:57:49.793 [main] INFO o.a.k.streams.KafkaStreams - stream-client [dev1-d1c8ce47-6fbf-41b7-b8aa-e3d094703088] Kafka Streams commit ID: 37edeed0777bacb3
07:57:49.800 [main] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] Creating restore consumer client
07:57:49.802 [main] INFO o.a.k.clients.consumer.ConsumerConfig - ConsumerConfig values:
    allow.auto.create.topics = true
    auto.commit.interval.ms = 5000
    auto.offset.reset = none
    bootstrap.servers = [localhost:9092]
    check.crcs = true
    client.dns.lookup = use_all_dns_ips
    client.id = StreamThread-1-restore-consumer
    client.rack =
    connections.max.idle.ms = 540000
    default.api.timeout.ms = 60000
    enable.auto.commit = false
    exclude.internal.topics = true
    fetch.max.bytes = 52428800
    fetch.max.wait.ms = 500
    fetch.min.bytes = 1
    group.id = null
    group.instance.id = null
    heartbeat.interval.ms = 3000
    interceptor.classes = []
    internal.leave.group.on.close = false
    internal.throw.on.fetch.stable.offset.unsupported = false
    isolation.level = read_uncommitted
    key.deserializer = class o.a.k.common.serialization.ByteArrayDeserializer
    max.partition.fetch.bytes = 1048576
    max.poll.interval.ms = 300000
    max.poll.records = 1000
    metadata.max.age.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    partition.assignment.strategy = [class o.a.k.clients.consumer.RangeAssignor, class o.a.k.clients.consumer.CooperativeStickyAssignor]
    receive.buffer.bytes = 65536
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 30000
    retry.backoff.ms = 100
    sasl.client.callback.handler.class = null
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 60000
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.login.callback.handler.class = null
    sasl.login.class = null
    sasl.login.refresh.buffer.seconds = 300
    sasl.login.refresh.min.period.seconds = 60
    sasl.login.refresh.window.factor = 0.8
    sasl.login.refresh.window.jitter = 0.05
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    security.providers = null
    send.buffer.bytes = 131072
    session.timeout.ms = 45000
    socket.connection.setup.timeout.max.ms = 30000
    socket.connection.setup.timeout.ms = 10000
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
    ssl.endpoint.identification.algorithm = https
    ssl.engine.factory.class = null
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.certificate.chain = null
    ssl.keystore.key = null
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLSv1.3
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.certificates = null
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS
    value.deserializer = class o.a.k.common.serialization.ByteArrayDeserializer

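Note the tell-tale restore-consumer settings above: `group.id = null` (it reads changelog topics outside any consumer group) and `auto.offset.reset = none`. When comparing such dumps across environments, it helps to turn the `key = value` lines into a map and diff that instead of eyeballing text. A small stand-alone sketch (the sample dump string is abbreviated, not the full log):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfigDumpParser {
    // Parses lines of the form "    key = value" as printed by Kafka's
    // config classes; lines without " = " are skipped.
    public static Map<String, String> parse(String dump) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String line : dump.split("\n")) {
            int idx = line.indexOf(" = ");
            if (idx < 0) continue;                // header or blank line
            out.put(line.substring(0, idx).trim(),
                    line.substring(idx + 3).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String dump = "    group.id = null\n    auto.offset.reset = none\n";
        Map<String, String> cfg = parse(dump);
        System.out.println(cfg.get("auto.offset.reset")); // prints "none"
    }
}
```

Two maps built this way can be compared with plain `Map.equals` or key-by-key to spot the one setting that differs between a working and a failing deployment.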
07:57:49.816 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka version: 3.0.0
07:57:49.816 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka commitId: 8cb0a5e9d3441962
07:57:49.816 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka startTimeMs: 1644908269816
07:57:49.818 [main] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] Creating thread producer client
07:57:49.820 [main] INFO o.a.k.clients.producer.ProducerConfig - ProducerConfig values:
    acks = -1
    batch.size = 16384
    bootstrap.servers = [localhost:9092]
    buffer.memory = 33554432
    client.dns.lookup = use_all_dns_ips
    client.id = StreamThread-1-producer
    compression.type = none
    connections.max.idle.ms = 540000
    delivery.timeout.ms = 120000
    enable.idempotence = true
    interceptor.classes = []
    key.serializer = class o.a.k.common.serialization.ByteArraySerializer
    linger.ms = 100
    max.block.ms = 60000
    max.in.flight.requests.per.connection = 5
    max.request.size = 1048576
    metadata.max.age.ms = 300000
    metadata.max.idle.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    partitioner.class = class o.a.k.clients.producer.internals.DefaultPartitioner
    receive.buffer.bytes = 32768
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 30000
    retries = 2147483647
    retry.backoff.ms = 100
    sasl.client.callback.handler.class = null
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 60000
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.login.callback.handler.class = null
    sasl.login.class = null
    sasl.login.refresh.buffer.seconds = 300
    sasl.login.refresh.min.period.seconds = 60
    sasl.login.refresh.window.factor = 0.8
    sasl.login.refresh.window.jitter = 0.05
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    security.providers = null
    send.buffer.bytes = 131072
    socket.connection.setup.timeout.max.ms = 30000
    socket.connection.setup.timeout.ms = 10000
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
    ssl.endpoint.identification.algorithm = https
    ssl.engine.factory.class = null
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.certificate.chain = null
    ssl.keystore.key = null
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLSv1.3
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.certificates = null
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS
    transaction.timeout.ms = 60000
    transactional.id = null
    value.serializer = class o.a.k.common.serialization.ByteArraySerializer

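The producer dump is internally consistent with idempotent writes: `enable.idempotence = true` requires `acks = -1` (all), a positive `retries`, and at most 5 `max.in.flight.requests.per.connection`, and all three hold above (retries = 2147483647 is Integer.MAX_VALUE). That invariant can be expressed as a small sanity check over a key/value map; the helper name is illustrative, not a Kafka API:

```java
import java.util.Map;

public class IdempotenceCheck {
    // Returns true when the settings satisfy Kafka's requirements for
    // idempotent producers: acks=all (-1), retries > 0, <= 5 in flight.
    public static boolean idempotenceValid(Map<String, String> cfg) {
        if (!"true".equals(cfg.get("enable.idempotence"))) return true; // nothing to enforce
        boolean acksAll = "-1".equals(cfg.get("acks")) || "all".equals(cfg.get("acks"));
        boolean retriesOk = Long.parseLong(cfg.get("retries")) > 0;
        boolean inFlightOk =
                Integer.parseInt(cfg.get("max.in.flight.requests.per.connection")) <= 5;
        return acksAll && retriesOk && inFlightOk;
    }

    public static void main(String[] args) {
        Map<String, String> cfg = Map.of(              // values copied from the log above
                "enable.idempotence", "true",
                "acks", "-1",
                "retries", "2147483647",
                "max.in.flight.requests.per.connection", "5");
        System.out.println(idempotenceValid(cfg));     // prints "true"
    }
}
```

A real producer performs this validation itself and throws a ConfigException at startup when the combination is inconsistent; the sketch just makes the rule explicit.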
07:57:49.828 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka version: 3.0.0
07:57:49.828 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka commitId: 8cb0a5e9d3441962
07:57:49.828 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka startTimeMs: 1644908269828
07:57:49.830 [main] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] Creating consumer client
07:57:49.831 [main] INFO o.a.k.clients.consumer.ConsumerConfig - ConsumerConfig values:
    allow.auto.create.topics = false
    auto.commit.interval.ms = 5000
    auto.offset.reset = earliest
    bootstrap.servers = [localhost:9092]
    check.crcs = true
    client.dns.lookup = use_all_dns_ips
    client.id = StreamThread-1-consumer
    client.rack =
    connections.max.idle.ms = 540000
    default.api.timeout.ms = 60000
    enable.auto.commit = false
    exclude.internal.topics = true
    fetch.max.bytes = 52428800
    fetch.max.wait.ms = 500
    fetch.min.bytes = 1
    group.id = dev1
    group.instance.id = null
    heartbeat.interval.ms = 3000
    interceptor.classes = []
    internal.leave.group.on.close = false
    internal.throw.on.fetch.stable.offset.unsupported = false
                                                                                      isolation.level = read_uncommitted
                                                                                      key.deserializer = class o.a.k.common.serialization.ByteArrayDeserializer
                                                                                      max.partition.fetch.bytes = 1048576
                                                                                      max.poll.interval.ms = 300000
                                                                                      max.poll.records = 1000
                                                                                      metadata.max.age.ms = 300000
                                                                                      metric.reporters = []
                                                                                      metrics.num.samples = 2
                                                                                      metrics.recording.level = INFO
                                                                                      metrics.sample.window.ms = 30000
                                                                                      partition.assignment.strategy = [o.a.k.streams.processor.internals.StreamsPartitionAssignor]
                                                                                      receive.buffer.bytes = 65536
                                                                                      reconnect.backoff.max.ms = 1000
                                                                                      reconnect.backoff.ms = 50
                                                                                      request.timeout.ms = 30000
                                                                                      retry.backoff.ms = 100
                                                                                      sasl.client.callback.handler.class = null
                                                                                      sasl.jaas.config = null
                                                                                      sasl.kerberos.kinit.cmd = /usr/bin/kinit
                                                                                      sasl.kerberos.min.time.before.relogin = 60000
                                                                                      sasl.kerberos.service.name = null
                                                                                      sasl.kerberos.ticket.renew.jitter = 0.05
                                                                                      sasl.kerberos.ticket.renew.window.factor = 0.8
                                                                                      sasl.login.callback.handler.class = null
                                                                                      sasl.login.class = null
                                                                                      sasl.login.refresh.buffer.seconds = 300
                                                                                      sasl.login.refresh.min.period.seconds = 60
                                                                                      sasl.login.refresh.window.factor = 0.8
                                                                                      sasl.login.refresh.window.jitter = 0.05
                                                                                      sasl.mechanism = GSSAPI
                                                                                      security.protocol = PLAINTEXT
                                                                                      security.providers = null
                                                                                      send.buffer.bytes = 131072
                                                                                      session.timeout.ms = 45000
                                                                                      socket.connection.setup.timeout.max.ms = 30000
                                                                                      socket.connection.setup.timeout.ms = 10000
                                                                                      ssl.cipher.suites = null
                                                                                      ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
                                                                                      ssl.endpoint.identification.algorithm = https
                                                                                      ssl.engine.factory.class = null
                                                                                      ssl.key.password = null
                                                                                      ssl.keymanager.algorithm = SunX509
                                                                                      ssl.keystore.certificate.chain = null
                                                                                      ssl.keystore.key = null
                                                                                      ssl.keystore.location = null
                                                                                      ssl.keystore.password = null
                                                                                      ssl.keystore.type = JKS
                                                                                      ssl.protocol = TLSv1.3
                                                                                      ssl.provider = null
                                                                                      ssl.secure.random.implementation = null
                                                                                      ssl.trustmanager.algorithm = PKIX
                                                                                      ssl.truststore.certificates = null
                                                                                      ssl.truststore.location = null
                                                                                      ssl.truststore.password = null
                                                                                      ssl.truststore.type = JKS
                                                                                      value.deserializer = class o.a.k.common.serialization.ByteArrayDeserializer
                                                                                  
                                                                                      replication.factor = -1
                                                                                      windowstore.changelog.additional.retention.ms = 86400000
                                                                                  07:57:49.836 [main] INFO o.a.k.streams.processor.internals.assignment.AssignorConfiguration - stream-thread [StreamThread-1-consumer] Cooperative rebalancing protocol is enabled now
                                                                                  07:57:49.840 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka version: 3.0.0
                                                                                  07:57:49.840 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka commitId: 8cb0a5e9d3441962
                                                                                  07:57:49.840 [main] INFO o.a.k.common.utils.AppInfoParser - Kafka startTimeMs: 1644908269840
                                                                                  07:57:49.844 [main] INFO o.a.k.streams.KafkaStreams - stream-client [dev1-d1c8ce47-6fbf-41b7-b8aa-e3d094703088] State transition from CREATED to REBALANCING
                                                                                  07:57:49.845 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] Starting
                                                                                  07:57:49.845 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] State transition from CREATED to STARTING
                                                                                  07:57:49.845 [StreamThread-1] INFO o.a.k.clients.consumer.KafkaConsumer - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Subscribed to topic(s): poids_garmin_brut
                                                                                  07:57:49.845 [main] INFO o.a.k.streams.KafkaStreams - stream-client [dev1-d1c8ce47-6fbf-41b7-b8aa-e3d094703088] State transition from REBALANCING to PENDING_SHUTDOWN
                                                                                  07:57:49.846 [kafka-streams-close-thread] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] Informed to shut down
                                                                                  07:57:49.846 [kafka-streams-close-thread] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] State transition from STARTING to PENDING_SHUTDOWN
                                                                                  07:57:49.919 [kafka-producer-network-thread | StreamThread-1-producer] INFO o.a.k.clients.Metadata - [Producer clientId=StreamThread-1-producer] Cluster ID: QKJGs4glRAy7besZxXNCrg
                                                                                  07:57:49.920 [StreamThread-1] INFO o.a.k.clients.Metadata - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Cluster ID: QKJGs4glRAy7besZxXNCrg
                                                                                  07:57:49.921 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Discovered group coordinator debian:9092 (id: 2147483647 rack: null)
                                                                                  07:57:49.922 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] (Re-)joining group
                                                                                  07:57:49.929 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Request joining group due to: need to re-join with the given member-id
                                                                                  07:57:49.929 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] (Re-)joining group
                                                                                  07:57:49.930 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Successfully joined group with generation Generation{generationId=3, memberId='StreamThread-1-consumer-34c0df37-baeb-4582-bdfe-79ab9e2e410c', protocol='stream'}
                                                                                  07:57:49.936 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamsPartitionAssignor - stream-thread [StreamThread-1-consumer] All members participating in this rebalance: 
                                                                                  d1c8ce47-6fbf-41b7-b8aa-e3d094703088: [StreamThread-1-consumer-34c0df37-baeb-4582-bdfe-79ab9e2e410c].
                                                                                  07:57:49.938 [StreamThread-1] INFO o.a.k.streams.processor.internals.assignment.HighAvailabilityTaskAssignor - Decided on assignment: {d1c8ce47-6fbf-41b7-b8aa-e3d094703088=[activeTasks: ([0_0]) standbyTasks: ([]) prevActiveTasks: ([]) prevStandbyTasks: ([]) changelogOffsetTotalsByTask: ([]) taskLagTotals: ([]) capacity: 1 assigned: 1]} with no followup probing rebalance.
                                                                                  07:57:49.938 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamsPartitionAssignor - stream-thread [StreamThread-1-consumer] Assigned tasks [0_0] including stateful [] to clients as: 
                                                                                  d1c8ce47-6fbf-41b7-b8aa-e3d094703088=[activeTasks: ([0_0]) standbyTasks: ([])].
                                                                                  07:57:49.939 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamsPartitionAssignor - stream-thread [StreamThread-1-consumer] Client d1c8ce47-6fbf-41b7-b8aa-e3d094703088 per-consumer assignment:
                                                                                      prev owned active {}
                                                                                      prev owned standby {StreamThread-1-consumer-34c0df37-baeb-4582-bdfe-79ab9e2e410c=[]}
                                                                                      assigned active {StreamThread-1-consumer-34c0df37-baeb-4582-bdfe-79ab9e2e410c=[0_0]}
                                                                                      revoking active {}
                                                                                      assigned standby {}
                                                                                  
                                                                                  07:57:49.939 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamsPartitionAssignor - stream-thread [StreamThread-1-consumer] Finished stable assignment of tasks, no followup rebalances required.
                                                                                  07:57:49.939 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Finished assignment for group at generation 3: {StreamThread-1-consumer-34c0df37-baeb-4582-bdfe-79ab9e2e410c=Assignment(partitions=[poids_garmin_brut-0], userDataSize=52)}
                                                                                  07:57:49.943 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Successfully synced group in generation Generation{generationId=3, memberId='StreamThread-1-consumer-34c0df37-baeb-4582-bdfe-79ab9e2e410c', protocol='stream'}
                                                                                  07:57:49.943 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Updating assignment with
                                                                                      Assigned partitions:                       [poids_garmin_brut-0]
                                                                                      Current owned partitions:                  []
                                                                                      Added partitions (assigned - owned):       [poids_garmin_brut-0]
                                                                                      Revoked partitions (owned - assigned):     []
                                                                                  
                                                                                  07:57:49.943 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Notifying assignor about the new Assignment(partitions=[poids_garmin_brut-0], userDataSize=52)
                                                                                  07:57:49.944 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamsPartitionAssignor - stream-thread [StreamThread-1-consumer] No followup rebalance was requested, resetting the rebalance schedule.
                                                                                  07:57:49.944 [StreamThread-1] INFO o.a.k.streams.processor.internals.TaskManager - stream-thread [StreamThread-1] Handle new assignment with:
                                                                                      New active tasks: [0_0]
                                                                                      New standby tasks: []
                                                                                      Existing active tasks: []
                                                                                      Existing standby tasks: []
                                                                                  07:57:49.950 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Adding newly assigned partitions: poids_garmin_brut-0
                                                                                  07:57:49.953 [StreamThread-1] INFO o.a.k.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=StreamThread-1-consumer, groupId=dev1] Found no committed offset for partition poids_garmin_brut-0
                                                                                  07:57:49.954 [StreamThread-1] INFO o.a.k.streams.processor.internals.StreamThread - stream-thread [StreamThread-1] Shutting down
                                                                                  [...]
                                                                                  
                                                                                  Process finished with exit code 0
                                                                                  

                                                                                  What am I doing wrong?

                                                                                   • I'm running my Kafka instance and the Java program locally, on the same PC.

                                                                                   • I've tried Kafka versions 3.1.0 and 2.8.1, and removed any traces of Spring from the Java program, without success.

                                                                                   I believe I'm facing a configuration problem.

                                                                                  ANSWER

                                                                                  Answered 2022-Feb-15 at 14:36

                                                                                   The following should work.

                                                                                       LOGGER.info("The Garmin data extractor is starting...");
                                                                                  
                                                                                       /* The input CSV file's data looks like this:

                                                                                        Durée,Poids,Variation,IMC,Masse grasse,Masse musculaire squelettique,Masse osseuse,Masse hydrique,
                                                                                        " 14 Fév. 2022",
                                                                                        06:37,72.1 kg,0.3 kg,22.8,26.3 %,29.7 kg,3.5 kg,53.8 %,
                                                                                        " 13 Fév. 2022",
                                                                                        06:48,72.4 kg,0.2 kg,22.9,25.4 %,29.8 kg,3.6 kg,54.4 %,
                                                                                      */
                                                                                  
                                                                                       // Create a stream with no key; values are plain strings.
                                                                                      StreamsBuilder builder = new StreamsBuilder();
                                                                                      builder.stream("poids_garmin_brut")
                                                                                              .foreach((k, v) -> {
                                                                                                  LOGGER.info(v.toString());
                                                                                              });
                                                                                  
                                                                                      KafkaStreams streams = new KafkaStreams(builder.build(), config());
                                                                                      streams.start();
                                                                                  
                                                                                       // Close the Kafka stream when the JVM shuts down, by having it call
                                                                                       // streams.close();
                                                                                      Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
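
                                                                                   The config() method used above is not shown in the answer. A minimal sketch of what it might contain, assuming a local broker on localhost:9092 and the application id dev1 that appears in the logs (the string serdes are a suggestion, not taken from the original code):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsConfig;

// Hypothetical config() helper; the broker address and application id
// are assumptions read off the log output above.
private static Properties config() {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "dev1");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    // Default to string serdes so the foreach above can log readable values.
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    return props;
}
```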
                                                                                  

                                                                                  OUTPUT

                                                                                  2022-02-15 20:05:54 INFO  ConsumerCoordinator:291 - [Consumer clientId=dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1-consumer, groupId=dev1] Adding newly assigned partitions: poids_garmin_brut-0
                                                                                  2022-02-15 20:05:54 INFO  StreamThread:229 - stream-thread [dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1] State transition from STARTING to PARTITIONS_ASSIGNED
                                                                                  2022-02-15 20:05:54 INFO  ConsumerCoordinator:844 - [Consumer clientId=dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1-consumer, groupId=dev1] Setting offset for partition poids_garmin_brut-0 to the committed offset FetchPosition{offset=21, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[LAPTOP-J1JBHQUR:9092 (id: 0 rack: null)], epoch=0}}
                                                                                  2022-02-15 20:05:54 INFO  StreamTask:240 - stream-thread [dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1] task [0_0] Initialized
                                                                                  2022-02-15 20:05:54 INFO  StreamTask:265 - stream-thread [dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1] task [0_0] Restored and ready to run
                                                                                  2022-02-15 20:05:54 INFO  StreamThread:882 - stream-thread [dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1] Restoration took 30 ms for all tasks [0_0]
                                                                                  2022-02-15 20:05:54 INFO  StreamThread:229 - stream-thread [dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1] State transition from PARTITIONS_ASSIGNED to RUNNING
                                                                                  2022-02-15 20:05:54 INFO  KafkaStreams:332 - stream-client [dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b] State transition from REBALANCING to RUNNING
                                                                                  2022-02-15 20:05:54 INFO  KafkaConsumer:2254 - [Consumer clientId=dev1-5e3fab76-51c7-41b5-aedf-99a4a071589b-StreamThread-1-consumer, groupId=dev1] Requesting the log end offset for poids_garmin_brut-0 in order to compute lag
                                                                                  2022-02-15 20:06:03 INFO  Main:33 - Test22
                                                                                  2022-02-15 20:06:06 INFO  Main:33 - Test23
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71122596

                                                                                  QUESTION

                                                                                  Error when running Pytest with DeltaTables
                                                                                  Asked 2022-Feb-14 at 10:18

                                                                                   I am working in the VDI of a company that uses its own Artifactory for security reasons. Currently I am writing unit tests for a function that deletes entries from a Delta table. When I started, I received an unresolved-dependencies error, because my Spark session was configured to load jars from Maven. I was able to solve this issue by loading these jars locally from /opt/spark/jars. Now my code looks like this:

                                                                                  class TestTransformation(unittest.TestCase):
                                                                                      @classmethod
                                                                                      def test_ksu_deletion(self):
                                                                                          self.spark = SparkSession.builder\
                                                                                                          .appName('SPARK_DELETION')\
                                                                                                          .config("spark.delta.logStore.class", "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore")\
                                                                                                          .config("spark.jars", "/opt/spark/jars/delta-core_2.12-0.7.0.jar, /opt/spark/jars/hadoop-aws-3.2.0.jar")\
                                                                                                          .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")\
                                                                                                          .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")\
                                                                                                          .getOrCreate()
                                                                                          os.environ["KSU_DELETION_OBJECT"]="UNITTEST/"
                                                                                          deltatable = DeltaTable.forPath(self.spark, "/projects/some/path/snappy.parquet")
                                                                                           deltatable.delete(col("DATE") < get_current())
                                                                                  

                                                                                  However, I am getting the error message:

                                                                                  E     py4j.protocol.Py4JJavaError: An error occurred while calling z:io.delta.tables.DeltaTable.forPath.
                                                                                   E     : java.lang.NoSuchMethodError: org.apache.spark.sql.AnalysisException.<init>(Ljava/lang/String;Lscala/Option;Lscala/Option;Lscala/Option;Lscala/Option;)V
                                                                                  

                                                                                   Do you have any idea what this is caused by? I am assuming it has to do with the way I am configuring spark.sql.extensions and/or spark.sql.catalog, but to be honest, I am quite new to Spark. I would greatly appreciate any hint.

                                                                                  Thanks a lot in advance!

                                                                                  Edit: We are using Spark 3.0.2 (Scala 2.12.10). According to https://docs.delta.io/latest/releases.html, this should be compatible. Apart from the SparkSession, I trimmed down the subsequent code to

                                                                                   df = spark.read.parquet("Path/to/file.snappy.parquet")
                                                                                  

                                                                                  and now I am getting the error message

                                                                                  java.lang.IncompatibleClassChangeError: class org.apache.spark.sql.catalyst.plans.logical.DeltaDelete has interface org.apache.spark.sql.catalyst.plans.logical.UnaryNode as super class
                                                                                  

As I said, I am quite new to (Py)Spark, so please don't hesitate to mention things you consider completely obvious.

Edit 2: I checked the Python path I am exporting in the shell before running the code and I can see the following: Could this cause any problem? I don't understand why I do not get this error when running the code within pipenv (with spark-submit).

                                                                                  ANSWER

                                                                                  Answered 2022-Feb-14 at 10:18

It looks like you're using an incompatible version of the Delta Lake library. 0.7.0 was for Spark 3.0, but you're using another version, either lower or higher. Consult the Delta releases page to find the mapping between Delta versions and the required Spark versions.

If you're using Spark 3.1 or 3.2, consider using the delta-spark Python package, which will install all necessary dependencies so you can just import the DeltaTable class.
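For instance, a minimal sketch of building a Delta-enabled session with delta-spark (the app name is illustrative; this is a configuration fragment and needs a working pyspark/delta-spark install to run):

```python
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

# Requires: pip install delta-spark (which pulls in a compatible pyspark)
builder = (
    SparkSession.builder.appName("delta-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
# configure_spark_with_delta_pip adds the matching delta-core jars to the
# session, avoiding exactly the kind of version mismatch seen above
spark = configure_spark_with_delta_pip(builder).getOrCreate()

from delta.tables import DeltaTable  # now resolves against a consistent classpath
```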

Update: Yes, this happens because of the conflicting versions: you need to remove the delta-spark and pyspark Python packages, and install pyspark==3.0.2 explicitly.

P.S. Also, look into the pytest-spark package, which can simplify the specification of configuration for all tests. You can find examples of it + Delta here.

                                                                                  Source https://stackoverflow.com/questions/71084507

                                                                                  QUESTION

                                                                                  How can I have nice file names & efficient storage usage in my Foundry Magritte dataset export?
                                                                                  Asked 2022-Feb-10 at 05:12

                                                                                  I'm working on exporting data from Foundry datasets in parquet format using various Magritte export tasks to an ABFS system (but the same issue occurs with SFTP, S3, HDFS, and other file based exports).

The datasets I'm exporting are relatively small, under 512 MB in size, which means they don't really need to be split across multiple parquet files; putting all the data in one file is enough. I've done this by ending the previous transform with a .coalesce(1) to get all of the data into a single file.

                                                                                  The issues are:

• By default the file name is part-0000-<rid>.snappy.parquet, with a different rid on every build. This means that whenever a new file is uploaded, it appears in the same folder as an additional file; the only way to tell which is the newest version is by last-modified date.
• Every version of the data is stored in my external system; this takes up unnecessary storage unless I frequently go in and delete old files.

All of this is unnecessary complexity added to my downstream system; I just want to be able to pull the latest version of the data in a single step.

                                                                                  ANSWER

                                                                                  Answered 2022-Jan-13 at 15:27

This is possible by renaming the single parquet file in the dataset so that it always has the same file name; that way the export task will overwrite the previous file in the external system.

                                                                                  This can be done using raw file system access. The write_single_named_parquet_file function below validates its inputs, creates a file with a given name in the output dataset, then copies the file in the input dataset to it. The result is a schemaless output dataset that contains a single named parquet file.

                                                                                  Notes

• The build will fail if the input contains more than one parquet file; as pointed out in the question, calling .coalesce(1) (or .repartition(1)) in the upstream transform is necessary.
• If you require transaction history in your external store, or your dataset is much larger than 512 MB, this method is not appropriate, as only the latest version is kept, and you likely want multiple parquet files for use in your downstream system. The createTransactionFolders (put each new export in a different folder) and flagFile (create a flag file once all files have been written) options can be useful in this case.
• The transform does not require any Spark executors, so it is possible to use @configure() to give it a driver-only profile. Giving the driver additional memory should fix out-of-memory errors when working with larger datasets.
                                                                                  • shutil.copyfileobj is used because the 'files' that are opened are actually just file objects.

                                                                                  Full code snippet

                                                                                  example_transform.py

                                                                                  from transforms.api import transform, Input, Output
from . import utils  # assumes utils.py lives in the same package as this transform
                                                                                  
                                                                                  
                                                                                  @transform(
                                                                                      output=Output("/path/to/output"),
                                                                                      source_df=Input("/path/to/input"),
                                                                                  )
                                                                                  def compute(output, source_df):
                                                                                      return utils.write_single_named_parquet_file(output, source_df, "readable_file_name")
                                                                                  

                                                                                  utils.py

                                                                                  from transforms.api import Input, Output
                                                                                  import shutil
                                                                                  import logging
                                                                                  
                                                                                  log = logging.getLogger(__name__)
                                                                                  
                                                                                  
                                                                                  def write_single_named_parquet_file(output: Output, input: Input, file_name: str):
                                                                                      """Write a single ".snappy.parquet" file with a given file name to a transforms output, containing the data of the
                                                                                      single ".snappy.parquet" file in the transforms input.  This is useful when you need to export the data using
                                                                                      magritte, wanting a human readable name in the output, when not using separate transaction folders this should cause
                                                                                      the previous output to be automatically overwritten.
                                                                                  
The input to this function must contain a single ".snappy.parquet" file; this can be achieved by calling
`.coalesce(1)` or `.repartition(1)` on your dataframe at the end of the upstream transform that produces the input.
                                                                                  
This function should not be used for large dataframes (e.g. those greater than 512 MB in size); instead,
transaction folders should be enabled in the export.  This function can work for larger sizes, but you may find you
need additional driver memory to perform both the coalesce/repartition in the upstream transform, and here.
                                                                                  
                                                                                      This produces a dataset without a schema, so features like expectations can't be used.
                                                                                  
                                                                                      Parameters:
output (Output): The transforms output to write the single custom-named ".snappy.parquet" file to; this is
    the dataset you want to export
input (Input): The transforms input containing the data to be written to output; this must contain only one
    ".snappy.parquet" file (it can contain other files, for example logs)
file_name: The name of the file to be written; the ".snappy.parquet" suffix will be automatically appended if
    not already there, and ".snappy" and ".parquet" will be corrected to ".snappy.parquet"
                                                                                  
                                                                                      Raises:
                                                                                          RuntimeError: Input dataset must be coalesced or repartitioned into a single file.
                                                                                          RuntimeError: Input dataset file system cannot be empty.
                                                                                  
                                                                                      Returns:
                                                                                          void: writes the response to output, no return value
                                                                                      """
                                                                                      output.set_mode("replace")  # Make sure it is snapshotting
                                                                                  
                                                                                      input_files_df = input.filesystem().files()  # Get all files
                                                                                      input_files = [row[0] for row in input_files_df.collect()]  # noqa - first column in files_df is path
                                                                                      input_files = [f for f in input_files if f.endswith(".snappy.parquet")]  # filter non parquet files
                                                                                      if len(input_files) > 1:
                                                                                          raise RuntimeError("Input dataset must be coalesced or repartitioned into a single file.")
                                                                                      if len(input_files) == 0:
                                                                                          raise RuntimeError("Input dataset file system cannot be empty.")
                                                                                      input_file_path = input_files[0]
                                                                                  
                                                                                      log.info("Inital output file name: " + file_name)
                                                                                      # check for snappy.parquet and append if needed
                                                                                      if file_name.endswith(".snappy.parquet"):
                                                                                          pass  # if it is already correct, do nothing
                                                                                      elif file_name.endswith(".parquet"):
                                                                                          # if it ends with ".parquet" (and not ".snappy.parquet"), remove parquet and append ".snappy.parquet"
                                                                                          file_name = file_name.removesuffix(".parquet") + ".snappy.parquet"
                                                                                      elif file_name.endswith(".snappy"):
                                                                                          # if it ends with just ".snappy" then append ".parquet"
                                                                                          file_name = file_name + ".parquet"
                                                                                      else:
                                                                                          # if doesn't end with any of the above, add ".snappy.parquet"
                                                                                          file_name = file_name + ".snappy.parquet"
                                                                                      log.info("Final output file name: " + file_name)
                                                                                  
                                                                                      with input.filesystem().open(input_file_path, "rb") as in_f:  # open the input file
                                                                                          with output.filesystem().open(file_name, "wb") as out_f:  # open the output file
                                                                                              shutil.copyfileobj(in_f, out_f)  # write the file into a new file
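The suffix-handling branch in the middle of that function can be pulled out and sanity-checked as a standalone helper (a sketch; `normalize_parquet_name` is my own name, and `str.removesuffix` requires Python 3.9+):

```python
def normalize_parquet_name(file_name: str) -> str:
    """Return file_name, coerced to end with ".snappy.parquet"."""
    if file_name.endswith(".snappy.parquet"):
        return file_name  # already correct, do nothing
    if file_name.endswith(".parquet"):
        # ends with ".parquet" but not ".snappy.parquet": swap in the full suffix
        return file_name.removesuffix(".parquet") + ".snappy.parquet"
    if file_name.endswith(".snappy"):
        return file_name + ".parquet"  # just append the missing half
    return file_name + ".snappy.parquet"  # no recognised suffix at all

print(normalize_parquet_name("export"))          # export.snappy.parquet
print(normalize_parquet_name("export.parquet"))  # export.snappy.parquet
print(normalize_parquet_name("export.snappy"))   # export.snappy.parquet
```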
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/70652943

                                                                                  QUESTION

                                                                                  Upserts on Delta simply duplicates data?
                                                                                  Asked 2022-Feb-07 at 07:22

I'm fairly new to Delta and lakehouse on Databricks. I have some questions, based on the following actions:

                                                                                  • I import some parquet files
                                                                                  • Convert them to delta (creating 1 snappy.parquet file)
                                                                                  • Delete one random row (creating 1 new snappy.parquet file).
• I check the content of both snappy files (version 0 of the delta table, and version 1), and they both contain all of the data, each one with its specific differences.

                                                                                  Does this mean delta simply duplicates data for every new version?

How is this scalable? Or am I missing something?

                                                                                  ANSWER

                                                                                  Answered 2022-Feb-07 at 07:22

Yes, that's how Delta Lake works: when you modify data, it won't write only the delta; it takes the original file that is affected by the change, applies the changes, and writes it back. But take into account that not all data is duplicated, only the data in the files where the affected rows are. For example, you have 3 data files, and you're making changes to some rows that are in the 2nd file. In this case, Delta will create a new file with number 4 that contains the necessary changes plus the rest of the data from file 2, so you will have the following versions:

                                                                                  • Version 0: files 1, 2 & 3
• Version 1: files 1, 3 & 4
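This copy-on-write bookkeeping can be illustrated with a toy model (purely illustrative; real Delta records the live file list in JSON commits under _delta_log/, not in Python lists):

```python
# Toy model of Delta Lake's copy-on-write versioning.
# Each version is simply the list of data files that are "live" in it.
def commit_rewrite(versions, rewritten_file, new_file):
    """Record a new version in which rewritten_file is replaced by new_file."""
    latest = versions[-1]
    versions.append([f for f in latest if f != rewritten_file] + [new_file])

versions = [["file1", "file2", "file3"]]    # version 0: files 1, 2 & 3
commit_rewrite(versions, "file2", "file4")  # update some rows that live in file2
print(versions[0])  # ['file1', 'file2', 'file3']
print(versions[1])  # ['file1', 'file3', 'file4']
```

Only file2 is rewritten; files 1 and 3 are shared between both versions, which is why the storage cost is proportional to the touched files, not the whole table.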

                                                                                  Source https://stackoverflow.com/questions/71010769

                                                                                  QUESTION

                                                                                  OWL API NoSuchMethodError in saveOntology() call
                                                                                  Asked 2022-Jan-31 at 10:43

I am trying to call an OWL API Java program through the terminal and it crashes, while the exact same code runs fine when I run it in IntelliJ.

                                                                                  The exception that rises in my main code is this:

                                                                                  Exception in thread "main" java.lang.NoSuchMethodError: 'boolean org.semanticweb.owlapi.io.RDFResource.idRequiredForIndividualOrAxiom()'
                                                                                          at org.semanticweb.owlapi.rdf.rdfxml.renderer.RDFXMLRenderer.render(RDFXMLRenderer.java:204)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.render(RDFRendererBase.java:448)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.renderOntologyHeader(RDFRendererBase.java:441)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.render(RDFRendererBase.java:247)
                                                                                          at org.semanticweb.owlapi.rdf.rdfxml.renderer.RDFXMLStorer.storeOntology(RDFXMLStorer.java:51)
                                                                                          at org.semanticweb.owlapi.util.AbstractOWLStorer.storeOntology(AbstractOWLStorer.java:142)
                                                                                          at org.semanticweb.owlapi.util.AbstractOWLStorer.storeOntology(AbstractOWLStorer.java:106)
                                                                                          at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.saveOntology(OWLOntologyManagerImpl.java:1347)
                                                                                          at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.saveOntology(OWLOntologyManagerImpl.java:1333)
                                                                                          at com.stelios.JavaExplanations.main(JavaExplanations.java:112)
                                                                                  
                                                                                  

It seems as if calling idRequiredForIndividualOrAxiom() on an RDFResource object doesn't find the method that is inherited from the RDFNode class, but I have no clue why.

In order to post here, I kept only the saveOntology call in a minimal example, and the exception that is thrown is the same, with extra steps:

                                                                                  Exception in thread "main" java.lang.NoSuchMethodError: 'boolean org.semanticweb.owlapi.io.RDFResource.idRequiredForIndividualOrAxiom()'
                                                                                          at org.semanticweb.owlapi.rdf.rdfxml.renderer.RDFXMLRenderer.render(RDFXMLRenderer.java:204)
                                                                                          at org.semanticweb.owlapi.rdf.rdfxml.renderer.RDFXMLRenderer.render(RDFXMLRenderer.java:249)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.renderEntity(RDFRendererBase.java:298)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.render(RDFRendererBase.java:292)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.lambda$renderEntities$6(RDFRendererBase.java:285)
                                                                                          at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
                                                                                          at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
                                                                                          at java.base/java.util.ArrayList$Itr.forEachRemaining(ArrayList.java:1033)
                                                                                          at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
                                                                                          at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
                                                                                          at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
                                                                                          at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
                                                                                          at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
                                                                                          at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
                                                                                          at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.renderEntities(RDFRendererBase.java:285)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.renderInOntologySignatureEntities(RDFRendererBase.java:269)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.renderOntologyComponents(RDFRendererBase.java:253)
                                                                                          at org.semanticweb.owlapi.rdf.RDFRendererBase.render(RDFRendererBase.java:248)
                                                                                          at org.semanticweb.owlapi.rdf.rdfxml.renderer.RDFXMLStorer.storeOntology(RDFXMLStorer.java:51)
                                                                                          at org.semanticweb.owlapi.util.AbstractOWLStorer.storeOntology(AbstractOWLStorer.java:142)
                                                                                          at org.semanticweb.owlapi.util.AbstractOWLStorer.storeOntology(AbstractOWLStorer.java:106)
                                                                                          at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.saveOntology(OWLOntologyManagerImpl.java:1347)
                                                                                          at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.saveOntology(OWLOntologyManagerImpl.java:1333)
                                                                                          at com.stelios.JavaExplanations.main(JavaExplanations.java:47)
                                                                                  

                                                                                  In both my original code and the minimal example I call java with: java -cp /home/stelios/java_explanations/target/java_explanations-1.0-SNAPSHOT-jar-with-dependencies.jar com.stelios.JavaExplanations

                                                                                  Here is the minimal example that repeats this behavior for me. This is the Java code:

                                                                                  package com.stelios;
                                                                                  
                                                                                  import java.io.File;
                                                                                  import java.io.FileNotFoundException;
                                                                                  import java.io.FileOutputStream;
                                                                                  import java.util.*;
                                                                                  
                                                                                  import org.semanticweb.owlapi.apibinding.OWLManager;
                                                                                  import org.semanticweb.owlapi.io.*;
                                                                                  import org.semanticweb.owlapi.model.*;
                                                                                  
                                                                                  public class JavaExplanations {
                                                                                      public static void main(String[] args) throws OWLOntologyCreationException, FileNotFoundException, OWLOntologyStorageException {
                                                                                          String ontology1 = "/home/stelios/Desktop/huiyfgds/ONTO_ASRTD_hz162pai";
                                                                                          String ontology2 = "/home/stelios/Desktop/huiyfgds/ONTO_INFRD_hz162pai";
                                                                                  
                                                                                          OWLOntologyManager ontology_manager = OWLManager.createOWLOntologyManager();
                                                                                          OWLOntology asserted_ontology = ontology_manager.loadOntologyFromOntologyDocument(new File(ontology1));
                                                                                          ontology_manager.saveOntology(asserted_ontology, new StreamDocumentTarget(new FileOutputStream(ontology2)));
                                                                                      }
                                                                                  }
                                                                                  

This is the pom.xml in IntelliJ:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.stelios.expl</groupId>
    <artifactId>java_explanations</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>11</maven.compiler.source>
        <maven.compiler.target>11</maven.compiler.target>
    </properties>

    <dependencies>
        <dependency>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>owlexplanation</artifactId>
            <version>5.0.0</version>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>owlapi-distribution</artifactId>
            <version>5.1.9</version>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>org.semanticweb.hermit</artifactId>
            <version>1.4.5.519</version>
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.32</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-nop</artifactId>
            <version>1.7.32</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifestFile>src/main/resources/META-INF/MANIFEST.MF</manifestFile>
                    </archive>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
            </plugin>
        </plugins>

        <resources>
            <resource>
                <directory>src/main/java</directory>
                <includes>
                    <include>**/*.java</include>
                </includes>
            </resource>
        </resources>
    </build>
</project>

I think it is most probably some dependency/version error, but I don't see how that can be. I package everything I need into the jar that I pass on the classpath, defining the versions I want in pom.xml, and in that jar I can find only one org/semanticweb/owlapi/io/RDFResource.class file.

Reading this and this, I considered the possibility of having two different versions of the OWL API, since I had another .jar containing OWL API version 3.4.9 in the directory tree. I moved that file and rebuilt the Maven package just to be sure, and (as expected) nothing changed.

                                                                                  Other than the saveOntology() call, my original program is working as intended.

The only thing out of the ordinary is that IntelliJ gives me a Plugin 'maven-assembly-plugin:' not found warning, which I haven't managed to solve in any way and have been ignoring, as it hasn't been an issue in any of the operations I have needed. (If you know how to solve it, of course, do give me suggestions, but my main problem is the exception mentioned earlier.)

                                                                                  EDIT Here is the mvn dependency:tree output.

                                                                                  [INFO] Scanning for projects...
                                                                                  [INFO] 
                                                                                  [INFO] -----------------< com.stelios.expl:java_explanations >-----------------
                                                                                  [INFO] Building java_explanations 1.0-SNAPSHOT
                                                                                  [INFO] --------------------------------[ jar ]---------------------------------
                                                                                  [INFO] 
                                                                                  [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ java_explanations ---
                                                                                  [INFO] com.stelios.expl:java_explanations:jar:1.0-SNAPSHOT
                                                                                  [INFO] +- net.sourceforge.owlapi:owlexplanation:jar:5.0.0:compile
                                                                                  [INFO] |  +- net.sourceforge.owlapi:owlapi-api:jar:5.1.19:compile (version selected from constraint [5.0.0,5.9.9])
                                                                                  [INFO] |  |  \- javax.inject:javax.inject:jar:1:compile
                                                                                  [INFO] |  +- net.sourceforge.owlapi:owlapi-tools:jar:5.1.19:compile (version selected from constraint [5.0.0,5.9.9])
                                                                                  [INFO] |  \- net.sourceforge.owlapi:telemetry:jar:5.0.0:compile
                                                                                  [INFO] |     \- net.sourceforge.owlapi:owlapi-parsers:jar:5.1.19:compile (version selected from constraint [5.0.0,5.9.9])
                                                                                  [INFO] +- net.sourceforge.owlapi:owlapi-distribution:jar:5.1.9:compile
                                                                                  [INFO] |  +- net.sourceforge.owlapi:owlapi-compatibility:jar:5.1.9:compile
                                                                                  [INFO] |  |  \- net.sourceforge.owlapi:owlapi-apibinding:jar:5.1.9:compile
                                                                                  [INFO] |  |     +- net.sourceforge.owlapi:owlapi-impl:jar:5.1.9:compile
                                                                                  [INFO] |  |     +- net.sourceforge.owlapi:owlapi-oboformat:jar:5.1.9:compile
                                                                                  [INFO] |  |     \- net.sourceforge.owlapi:owlapi-rio:jar:5.1.9:compile
                                                                                  [INFO] |  +- com.fasterxml.jackson.core:jackson-core:jar:2.9.7:compile
                                                                                  [INFO] |  +- com.fasterxml.jackson.core:jackson-databind:jar:2.9.7:compile
                                                                                  [INFO] |  +- com.fasterxml.jackson.core:jackson-annotations:jar:2.9.7:compile
                                                                                  [INFO] |  +- org.apache.commons:commons-rdf-api:jar:0.5.0:compile
                                                                                  [INFO] |  +- org.tukaani:xz:jar:1.6:compile
                                                                                  [INFO] |  +- org.slf4j:jcl-over-slf4j:jar:1.7.22:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-model:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-api:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-languages:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-datatypes:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-binary:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-n3:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-nquads:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-ntriples:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-rdfjson:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-jsonld:jar:2.3.2:compile
                                                                                  [INFO] |  |  +- org.apache.httpcomponents:httpclient:jar:4.5.2:compile
                                                                                  [INFO] |  |  |  \- org.apache.httpcomponents:httpcore:jar:4.4.4:compile
                                                                                  [INFO] |  |  \- org.apache.httpcomponents:httpclient-cache:jar:4.5.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-rdfxml:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-trix:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-turtle:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-rio-trig:jar:2.3.2:compile
                                                                                  [INFO] |  +- org.eclipse.rdf4j:rdf4j-util:jar:2.3.2:compile
                                                                                  [INFO] |  +- com.github.jsonld-java:jsonld-java:jar:0.12.0:compile
                                                                                  [INFO] |  |  +- org.apache.httpcomponents:httpclient-osgi:jar:4.5.5:compile
                                                                                  [INFO] |  |  |  +- org.apache.httpcomponents:httpmime:jar:4.5.5:compile
                                                                                  [INFO] |  |  |  \- org.apache.httpcomponents:fluent-hc:jar:4.5.5:compile
                                                                                  [INFO] |  |  \- org.apache.httpcomponents:httpcore-osgi:jar:4.4.9:compile
                                                                                  [INFO] |  |     \- org.apache.httpcomponents:httpcore-nio:jar:4.4.9:compile
                                                                                  [INFO] |  +- com.github.vsonnier:hppcrt:jar:0.7.5:compile
                                                                                  [INFO] |  +- com.github.ben-manes.caffeine:caffeine:jar:2.6.1:compile
                                                                                  [INFO] |  +- com.google.guava:guava:jar:22.0:compile (version selected from constraint [18.0,22.0])
                                                                                  [INFO] |  |  +- com.google.errorprone:error_prone_annotations:jar:2.0.18:compile
                                                                                  [INFO] |  |  +- com.google.j2objc:j2objc-annotations:jar:1.1:compile
                                                                                  [INFO] |  |  \- org.codehaus.mojo:animal-sniffer-annotations:jar:1.14:compile
                                                                                  [INFO] |  +- com.google.code.findbugs:jsr305:jar:3.0.2:compile (version selected from constraint [2.0.0,4))
                                                                                  [INFO] |  \- commons-io:commons-io:jar:2.5:compile
                                                                                  [INFO] +- net.sourceforge.owlapi:org.semanticweb.hermit:jar:1.4.5.519:compile
                                                                                  [INFO] |  +- commons-logging:commons-logging:jar:1.1.3:compile
                                                                                  [INFO] |  +- org.apache.ws.commons.axiom:axiom-api:jar:1.2.14:compile
                                                                                  [INFO] |  |  +- org.apache.geronimo.specs:geronimo-activation_1.1_spec:jar:1.1:compile
                                                                                  [INFO] |  |  +- org.apache.geronimo.specs:geronimo-javamail_1.4_spec:jar:1.7.1:compile
                                                                                  [INFO] |  |  +- jaxen:jaxen:jar:1.1.4:compile
                                                                                  [INFO] |  |  +- org.apache.geronimo.specs:geronimo-stax-api_1.0_spec:jar:1.0.1:compile
                                                                                  [INFO] |  |  \- org.apache.james:apache-mime4j-core:jar:0.7.2:compile
                                                                                  [INFO] |  +- org.apache.ws.commons.axiom:axiom-c14n:jar:1.2.14:compile
                                                                                  [INFO] |  +- org.apache.ws.commons.axiom:axiom-impl:jar:1.2.14:compile
                                                                                  [INFO] |  |  \- org.codehaus.woodstox:woodstox-core-asl:jar:4.1.4:compile
                                                                                  [INFO] |  |     \- org.codehaus.woodstox:stax2-api:jar:3.1.1:compile
                                                                                  [INFO] |  +- org.apache.ws.commons.axiom:axiom-dom:jar:1.2.14:compile
                                                                                  [INFO] |  +- dk.brics.automaton:automaton:jar:1.11-8:compile
                                                                                  [INFO] |  +- gnu.getopt:java-getopt:jar:1.0.13:compile
                                                                                  [INFO] |  \- net.sf.trove4j:trove4j:jar:3.0.3:compile
                                                                                  [INFO] +- org.slf4j:slf4j-api:jar:1.7.22:compile
                                                                                  [INFO] +- org.slf4j:slf4j-nop:jar:1.7.32:compile
                                                                                  [INFO] \- org.apache.maven.plugins:maven-assembly-plugin:maven-plugin:3.3.0:compile
                                                                                  [INFO]    +- org.apache.maven:maven-plugin-api:jar:3.0:compile
                                                                                  [INFO]    |  \- org.sonatype.sisu:sisu-inject-plexus:jar:1.4.2:compile
                                                                                  [INFO]    |     \- org.sonatype.sisu:sisu-inject-bean:jar:1.4.2:compile
                                                                                  [INFO]    |        \- org.sonatype.sisu:sisu-guice:jar:noaop:2.1.7:compile
                                                                                  [INFO]    +- org.apache.maven:maven-core:jar:3.0:compile
                                                                                  [INFO]    |  +- org.apache.maven:maven-settings:jar:3.0:compile
                                                                                  [INFO]    |  +- org.apache.maven:maven-settings-builder:jar:3.0:compile
                                                                                  [INFO]    |  +- org.apache.maven:maven-repository-metadata:jar:3.0:compile
                                                                                  [INFO]    |  +- org.apache.maven:maven-model-builder:jar:3.0:compile
                                                                                  [INFO]    |  +- org.apache.maven:maven-aether-provider:jar:3.0:runtime
                                                                                  [INFO]    |  +- org.sonatype.aether:aether-impl:jar:1.7:compile
                                                                                  [INFO]    |  |  \- org.sonatype.aether:aether-spi:jar:1.7:compile
                                                                                  [INFO]    |  +- org.sonatype.aether:aether-api:jar:1.7:compile
                                                                                  [INFO]    |  +- org.sonatype.aether:aether-util:jar:1.7:compile
                                                                                  [INFO]    |  +- org.codehaus.plexus:plexus-classworlds:jar:2.2.3:compile
                                                                                  [INFO]    |  +- org.codehaus.plexus:plexus-component-annotations:jar:1.5.5:compile
                                                                                  [INFO]    |  \- org.sonatype.plexus:plexus-sec-dispatcher:jar:1.3:compile
                                                                                  [INFO]    |     \- org.sonatype.plexus:plexus-cipher:jar:1.4:compile
                                                                                  [INFO]    +- org.apache.maven:maven-artifact:jar:3.0:compile
                                                                                  [INFO]    +- org.apache.maven:maven-model:jar:3.0:compile
                                                                                  [INFO]    +- org.apache.maven.shared:maven-common-artifact-filters:jar:3.1.0:compile
                                                                                  [INFO]    |  \- org.apache.maven.shared:maven-shared-utils:jar:3.1.0:compile
                                                                                  [INFO]    +- org.apache.maven.shared:maven-artifact-transfer:jar:0.11.0:compile
                                                                                  [INFO]    +- org.codehaus.plexus:plexus-interpolation:jar:1.25:compile
                                                                                  [INFO]    +- org.codehaus.plexus:plexus-archiver:jar:4.2.1:compile
                                                                                  [INFO]    |  +- org.apache.commons:commons-compress:jar:1.19:compile
                                                                                  [INFO]    |  \- org.iq80.snappy:snappy:jar:0.4:compile
                                                                                  [INFO]    +- org.apache.maven.shared:file-management:jar:3.0.0:compile
                                                                                  [INFO]    +- org.apache.maven.shared:maven-shared-io:jar:3.0.0:compile
                                                                                  [INFO]    |  +- org.apache.maven:maven-compat:jar:3.0:compile
                                                                                  [INFO]    |  \- org.apache.maven.wagon:wagon-provider-api:jar:2.10:compile
                                                                                  [INFO]    +- org.apache.maven.shared:maven-filtering:jar:3.1.1:compile
                                                                                  [INFO]    |  \- org.sonatype.plexus:plexus-build-api:jar:0.0.7:compile
                                                                                  [INFO]    +- org.codehaus.plexus:plexus-io:jar:3.2.0:compile
                                                                                  [INFO]    +- org.apache.maven:maven-archiver:jar:3.5.0:compile
                                                                                  [INFO]    +- org.codehaus.plexus:plexus-utils:jar:3.3.0:compile
                                                                                  [INFO]    \- commons-codec:commons-codec:jar:1.6:compile
                                                                                  [INFO] ------------------------------------------------------------------------
                                                                                  [INFO] BUILD SUCCESS
                                                                                  [INFO] ------------------------------------------------------------------------
                                                                                  [INFO] Total time:  1.339 s
                                                                                  [INFO] Finished at: 2022-01-27T13:06:01+02:00
                                                                                  [INFO] ------------------------------------------------------------------------
                                                                                  
                                                                                  Process finished with exit code 0
                                                                                  
                                                                                  

                                                                                  ANSWER

                                                                                  Answered 2022-Jan-31 at 10:43

As can be seen in the comments on the post, my problem is fixed, so I thought I'd write up a closing answer here so as not to leave the post pending.

The actual solution: As explained nicely here by @UninformedUser, the issue was that I had conflicting Maven package versions in my dependencies. Bringing everything into sync solved the issue.

Incidental solution: As I wrote in the comments above, explicitly defining version 3.3.0 for the maven-assembly-plugin also happened to solve the issue. But this worked only by chance, as explained here by @Ignazio: it merely changed the order in which things were "assembled", overwriting the conflicting package.
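For reference, one way to keep the OWL API artifacts in sync is to pin them in Maven's dependencyManagement, so that owlexplanation's open version range [5.0.0,5.9.9] cannot resolve to a different release (5.1.19 in the tree above) than the 5.1.9 of owlapi-distribution. This is only a sketch; pinning to 5.1.9 is an illustrative choice, and any single consistent version should work:

```xml
<dependencyManagement>
    <dependencies>
        <!-- Force one OWL API version for all transitive dependencies,
             overriding the open range declared by owlexplanation. -->
        <dependency>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>owlapi-distribution</artifactId>
            <version>5.1.9</version>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>owlapi-api</artifactId>
            <version>5.1.9</version>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>owlapi-parsers</artifactId>
            <version>5.1.9</version>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.owlapi</groupId>
            <artifactId>owlapi-tools</artifactId>
            <version>5.1.9</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

After adding this, rerunning mvn dependency:tree should show every net.sourceforge.owlapi artifact at the same version.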

                                                                                  Huge thanks to both for the help.

                                                                                  Source https://stackoverflow.com/questions/70854565

                                                                                  QUESTION

                                                                                  pyarrow reading parquet from S3 performance confusions
                                                                                  Asked 2022-Jan-26 at 19:16

                                                                                  I have a Parquet file in AWS S3. I would like to read it into a Pandas DataFrame. There are two ways for me to accomplish this.

Option 1:

import pyarrow.parquet as pq
table = pq.read_table("s3://tpc-h-parquet/lineitem/part0.snappy.parquet")  # takes 1 sec
pandas_table = table.to_pandas()  # takes 1 sec!!!

Option 2:

import pandas as pd
table = pd.read_parquet("s3://tpc-h-parquet/lineitem/part0.snappy.parquet")  # takes 2 sec

I suspect option 2 is really just doing option 1 under the hood anyway.

                                                                                  What is the fastest way for me to read a Parquet file into Pandas?

                                                                                  ANSWER

                                                                                  Answered 2022-Jan-26 at 19:16

                                                                                  You are correct. Option 2 is just option 1 under the hood.

                                                                                  What is the fastest way for me to read a Parquet file into Pandas?

Both option 1 and option 2 are probably good enough. However, if you are trying to shave off every last bit of time, you may need to go one layer deeper, depending on your pyarrow version. It turns out that option 1 is actually also just a proxy, in this case to the datasets API:

                                                                                  import pyarrow.dataset as ds
                                                                                  dataset = ds.dataset("s3://tpc-h-parquet/lineitem/part0.snappy.parquet")
                                                                                  table = dataset.to_table(use_threads=True)
                                                                                  df = table.to_pandas()
                                                                                  

                                                                                  For pyarrow versions >= 4 and < 7 you can usually get slightly better performance on S3 using the asynchronous scanner:

                                                                                  import pyarrow.dataset as ds
                                                                                  dataset = ds.dataset("s3://tpc-h-parquet/lineitem/part0.snappy.parquet")
                                                                                  table = dataset.to_table(use_threads=True, use_async=True)
                                                                                  df = table.to_pandas()
                                                                                  

                                                                                  In pyarrow version 7 the asynchronous scanner is the default so you can once again simply use pd.read_parquet("s3://tpc-h-parquet/lineitem/part0.snappy.parquet")

                                                                                  Source https://stackoverflow.com/questions/70857825

                                                                                  QUESTION

                                                                                  Dask ParserError: Error tokenizing data when reading CSV
                                                                                  Asked 2022-Jan-19 at 17:11

                                                                                  I am getting the same error as this question, but the recommended solution of setting blocksize=None isn't solving the issue for me. I'm trying to convert the NYC taxi data from CSV to Parquet and this is the code I'm running:

                                                                                  ddf = dd.read_csv(
                                                                                      "s3://nyc-tlc/trip data/yellow_tripdata_2010-*.csv",
                                                                                      parse_dates=["pickup_datetime", "dropoff_datetime"],
                                                                                      blocksize=None,
                                                                                      dtype={
                                                                                          "tolls_amount": "float64",
                                                                                          "store_and_fwd_flag": "object",
                                                                                      },
                                                                                  )
                                                                                  
                                                                                  ddf.to_parquet(
                                                                                      "s3://coiled-datasets/nyc-tlc/2010",
                                                                                      engine="pyarrow",
                                                                                      compression="snappy",
                                                                                      write_metadata_file=False,
                                                                                  )
                                                                                  

                                                                                  Here's the error I'm getting:

                                                                                  "ParserError: Error tokenizing data. C error: Expected 18 fields in line 2958, saw 19".
                                                                                  

                                                                                  Adding blocksize=None helps sometimes, see here for example, and I'm not sure why it's not solving my issue.

                                                                                  Any suggestions on how to get past this issue?

This code works for the 2011 taxi data, so there must be something weird in the 2010 taxi data that's causing this issue.

                                                                                  ANSWER

                                                                                  Answered 2022-Jan-19 at 17:08

                                                                                  The raw file s3://nyc-tlc/trip data/yellow_tripdata_2010-02.csv contains an error (one too many commas). This is the offending line (middle) and its neighbours:

                                                                                  VTS,2010-02-16 08:02:00,2010-02-16 08:14:00,5,4.2999999999999998,-73.955112999999997,40.786718,1,,-73.924710000000005,40.841335000000001,CSH,11.699999999999999,0,0.5,0,0,12.199999999999999
                                                                                  CMT,2010-02-24 16:25:18,2010-02-24 16:52:14,1,12.4,-73.988956000000002,40.736567000000001,1,,,-73.861762999999996,40.768383999999998,CAS,29.300000000000001,1,0.5,0,4.5700000000000003,35.369999999999997
                                                                                  VTS,2010-02-16 07:58:00,2010-02-16 08:09:00,1,2.9700000000000002,-73.977469999999997,40.779359999999997,1,,-74.004427000000007,40.742137999999997,CRD,9.3000000000000007,0,0.5,1.5,0,11.300000000000001
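A quick way to locate such a row yourself is to count fields per line and flag any row that disagrees with the header. This stdlib-only sketch uses fabricated sample data that mirrors the shape of the error (one comma too many):

```python
import csv
import io

# Three-column CSV whose second data row has an extra comma,
# giving 4 fields where 3 are expected.
sample = (
    "a,b,c\n"
    "1,2,3\n"
    "4,5,,6\n"
    "7,8,9\n"
)

reader = csv.reader(io.StringIO(sample))
expected = len(next(reader))  # field count of the header row

# Collect (line number, field count) for every malformed row.
bad = [(lineno, len(row))
       for lineno, row in enumerate(reader, start=2)
       if len(row) != expected]
print(bad)  # [(3, 4)]
```

For a multi-gigabyte file on S3 you would stream the object line by line instead of loading it into memory, but the field-counting logic is the same.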
                                                                                  

                                                                                  Some of the options are:

• the on_bad_lines kwarg to pandas can be set to warn or skip (this should also be possible with dask.dataframe, which forwards keyword arguments to pandas);

                                                                                  • fix the raw file (knowing where the error is) with something like sed (assuming you can modify the raw files) or on the fly by reading the file line by line.
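For instance, the first option looks roughly like this with pandas >= 1.3 (the sample data is fabricated; dask.dataframe.read_csv would accept the same keyword):

```python
import io

import pandas as pd

# Three-column CSV whose second data row has one field too many.
raw = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# on_bad_lines="skip" silently drops malformed rows instead of raising
# ParserError; on_bad_lines="warn" keeps parsing but warns per bad line.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(len(df))  # 2 — the malformed row is dropped
```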

                                                                                  Source https://stackoverflow.com/questions/70763876

                                                                                  Community Discussions, Code Snippets contain sources that include Stack Exchange Network

                                                                                  Vulnerabilities

                                                                                  No vulnerabilities reported

Install snappy

You can download it from GitHub, or install from source with go get github.com/golang/snappy.

                                                                                  Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, check and ask on Stack Overflow.
                                                                                  CLONE
                                                                                • HTTPS

                                                                                  https://github.com/golang/snappy.git

                                                                                • CLI

                                                                                  gh repo clone golang/snappy

                                                                                • sshUrl

                                                                                  git@github.com:golang/snappy.git
