Configuration
Configuration of Parquet can be done using the setConf method on SparkSession or by running SET key=value commands using SQL.
Property Name Default Meaning
spark.sql.parquet.binaryAsString false Some other Parquet-producing systems, in particular Impala, Hive, and older versions of Spark SQL, do not differentiate between binary data and strings when writing out the Parquet schema. This flag tells Spark SQL to interpret binary data as a string to provide compatibility with these systems.
spark.sql.parquet.int96AsTimestamp true Some Parquet-producing systems, in particular Impala and Hive, store timestamps as INT96. This flag tells Spark SQL to interpret INT96 data as a timestamp to provide compatibility with these systems.
spark.sql.parquet.cacheMetadata true Turns on caching of Parquet schema metadata. Can speed up querying of static data.
spark.sql.parquet.compression.codec snappy Sets the compression codec to use when writing Parquet files. Acceptable values include: uncompressed, snappy, gzip, lzo.
spark.sql.parquet.filterPushdown true Enables Parquet filter push-down optimization when set to true.
spark.sql.hive.convertMetastoreParquet true When set to false, Spark SQL will use the Hive SerDe for Parquet tables instead of the built-in support.
spark.sql.parquet.mergeSchema false When true, the Parquet data source merges schemas collected from all data files; otherwise the schema is picked from the summary file, or from a random data file if no summary file is available.
spark.sql.optimizer.metadataOnly true When true, enables the metadata-only query optimization, which uses the table's metadata to produce the partition columns instead of scanning the table. It applies when all the columns scanned are partition columns and the query has an aggregate operator that satisfies distinct semantics.
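As a rough illustration of the SET key=value form mentioned above, the sketch below renders a few of the listed properties as SQL statements. The helper name to_set_statements and the PARQUET_OPTIONS dictionary are hypothetical, chosen only for this example; the property names and values come from the table. No Spark session is created here, so the statements are only printed.

```python
# Hypothetical selection of properties from the table above, with their defaults.
PARQUET_OPTIONS = {
    "spark.sql.parquet.binaryAsString": "false",
    "spark.sql.parquet.compression.codec": "snappy",
    "spark.sql.parquet.mergeSchema": "false",
}


def to_set_statements(options):
    """Return one `SET key=value` statement per option, in insertion order."""
    return [f"SET {key}={value}" for key, value in options.items()]


statements = to_set_statements(PARQUET_OPTIONS)
for stmt in statements:
    print(stmt)

# Assumption: with a live SparkSession bound to a variable named `spark`
# (not created in this sketch), each statement could be run via spark.sql(stmt),
# or each pair applied programmatically via spark.conf.set(key, value).
```

Keeping the options in one dictionary makes it easy to apply the same settings through either route, SQL or the session's runtime configuration.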