--- results/org.apache.nifi/nifi/2.0.0-M4/oss-rebuild-improved-2/reference/nifi-parquet-nar-2.0.0-M4-nar-extension-manifest.xml:extension-manifest.xml 2025-04-06 12:19:33.562656751 +0000
+++ results/org.apache.nifi/nifi/2.0.0-M4/oss-rebuild-improved-2/rebuild/nifi-parquet-nar-2.0.0-M4-nar-extension-manifest.xml:extension-manifest.xml 2025-04-06 12:19:40.778543421 +0000
@@ -1,5 +1,5 @@
org.apache.nifinifi-parquet-nar2.0.0-M4org.apache.nifinifi-hadoop-libraries-nar2.0.0-M42.0.0-M4nifi-2.0.0-M4-RC1UNKNOWN19c5be0org.apache.nifi.processors.parquet.CalculateParquetOffsetsPROCESSORThe processor generates N flow files from the input, and adds attributes with the offsets required to read the group of rows in the FlowFile's content. Can be used to increase the overall efficiency of processing extremely large Parquet files.parquetsplitpartitionbreak apartefficient processingload balanceclusterRecords Per SplitRecords Per SplitSpecifies how many records should be covered in each FlowFiletruefalsetrueFLOWFILE_ATTRIBUTESfalsefalseZero Content OutputZero Content OutputWhether or not to copy the content of the input FlowFile.falsetruetruefalsefalsetruefalsefalseNONEfalsefalsesuccessFlowFiles, with special attributes that represent a chunk of the input file.falserecord.offsetGets the index of the first record in the input.record.countGets the number of records in the input.parquet.file.range.startOffsetGets the start offset of the selected row group in the parquet file.parquet.file.range.endOffsetGets the end offset of the selected row group in the parquet file.record.offsetSets the index of the first record of the parquet file.record.countSets the number of records in the parquet file.trueINPUT_REQUIREDorg.apache.nifi.processors.parquet.CalculateParquetRowGroupOffsetsPROCESSORThe processor generates one FlowFile from each Row Group of the input, and adds attributes with the offsets required to read the group of rows in the FlowFile's content. 
Can be used to increase the overall efficiency of processing extremely large Parquet files.parquetsplitpartitionbreak apartefficient processingload balanceclusterZero Content OutputZero Content OutputWhether or not to copy the content of the input FlowFile.falsetruetruefalsefalsetruefalsefalseNONEfalsefalsesuccessFlowFiles, with special attributes that represent a chunk of the input file.falseparquet.file.range.startOffsetSets the start offset of the selected row group in the parquet file.parquet.file.range.endOffsetSets the end offset of the selected row group in the parquet file.record.countSets the count of records in the selected row group.trueINPUT_REQUIREDorg.apache.nifi.processors.parquet.ConvertAvroToParquetPROCESSORConverts Avro records into Parquet file format. The incoming FlowFile should be a valid Avro file. If an incoming FlowFile does not contain any records, an empty Parquet file is the output. NOTE: Many Avro datatypes (e.g., collections, primitives, and unions of primitives) can be converted to Parquet, but unions of collections and other complex datatypes may not be able to be converted to Parquet.avroparquetconvertcompression-typeCompression TypeThe type of compression for the file being written.UNCOMPRESSEDUNCOMPRESSEDUNCOMPRESSEDSNAPPYSNAPPYGZIPGZIPLZOLZOBROTLIBROTLILZ4LZ4ZSTDZSTDLZ4_RAWLZ4_RAWtruefalsefalseNONEfalsefalserow-group-sizeRow Group SizeThe row group size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsepage-sizePage SizeThe page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsedictionary-page-sizeDictionary Page SizeThe dictionary page size used by the Parquet writer. 
The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsemax-padding-sizeMax Padding SizeThe maximum amount of padding that will be used to align row groups with blocks in the underlying filesystem. If the underlying filesystem is not a block filesystem like HDFS, this has no effect. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseenable-dictionary-encodingEnable Dictionary EncodingSpecifies whether dictionary encoding should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalseenable-validationEnable ValidationSpecifies whether validation should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalsewriter-versionWriter VersionSpecifies the version used by Parquet writerPARQUET_1_0PARQUET_1_0PARQUET_2_0PARQUET_2_0falsefalsefalseNONEfalsefalsesuccessParquet file that was converted successfully from AvrofalsefailureAvro content that could not be processedfalsefilenameSets the filename to the existing filename with the extension replaced by / added to by .parquetrecord.countSets the number of records in the parquet file.INPUT_REQUIREDorg.apache.nifi.processors.parquet.FetchParquetPROCESSORReads from a given Parquet file and writes records to the content of the flow file using the selected record writer. The original Parquet file will remain unchanged, and the content of the flow file will be replaced with records of the selected type. This processor can be used with ListHDFS or ListFile to obtain a listing of files to fetch.parquethadoopHDFSgetingestfetchsourcerecordHadoop Configuration ResourcesHadoop Configuration ResourcesA file or comma separated list of files which contains the Hadoop file system configuration. 
Without this, Hadoop will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or will revert to a default configuration. To use swebhdfs, see 'Additional Details' section of PutHDFS's documentation.falsefalsetrueENVIRONMENTfalsefalseMULTIPLEFILEkerberos-credentials-serviceKerberos Credentials ServiceSpecifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberosorg.apache.nifi.kerberos.KerberosCredentialsServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalsekerberos-user-serviceKerberos User ServiceSpecifies the Kerberos User Controller Service that should be used for authenticating with Kerberosorg.apache.nifi.kerberos.KerberosUserServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalseKerberos PrincipalKerberos PrincipalKerberos principal to authenticate as. Requires nifi.kerberos.krb5.file to be set in your nifi.propertiesfalsefalsetrueENVIRONMENTfalsefalseKerberos KeytabKerberos KeytabKerberos keytab associated with the principal. Requires nifi.kerberos.krb5.file to be set in your nifi.propertiesfalsefalsetrueENVIRONMENTfalsefalseSINGLEFILEKerberos PasswordKerberos PasswordKerberos password associated with the principal.falsetruefalseNONEfalsefalseKerberos Relogin PeriodKerberos Relogin PeriodPeriod of time which should pass before attempting a kerberos relogin.
-This property has been deprecated, and has no effect on processing. Relogins now occur automatically.4 hoursfalsefalsetrueENVIRONMENTfalsefalseAdditional Classpath ResourcesAdditional Classpath ResourcesA comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.falsefalsefalseNONEtruefalseMULTIPLEDIRECTORYFILEfilenameFilenameThe name of the file to retrieve${path}/$(unknown)truefalsetrueFLOWFILE_ATTRIBUTESfalsefalserecord-writerRecord WriterThe service for writing records to the FlowFile contentorg.apache.nifi.serialization.RecordSetWriterFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseretryFlowFiles will be routed to this relationship if the content of the file cannot be retrieved, but might be able to be in the future if tried again. This generally indicates that the Fetch should be tried again.falsesuccessFlowFiles will be routed to this relationship once they have been updated with the content of the filefalsefailureFlowFiles will be routed to this relationship if the content of the file cannot be retrieved and trying again will likely not be helpful. 
This would occur, for instance, if the file is not found or if there is a permissions issuefalserecord.offsetGets the index of the first record in the input.record.countGets the number of records in the input.fetch.failure.reasonWhen a FlowFile is routed to 'failure', this attribute is added indicating why the file could not be fetched from the given filesystem.record.countThe number of records in the resulting flow filehadoop.file.urlThe hadoop url for the file is stored in this attribute.truetrue100 ms30 secWARNread distributed filesystemProvides the operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem.INPUT_REQUIREDorg.apache.nifi.processors.parquet.PutParquetorg.apache.nifi.processors.parquet.PutParquetPROCESSORReads records from an incoming FlowFile using the provided Record Reader, and writes those records to a Parquet file. The schema for the Parquet file must be provided in the processor properties. This processor will first write a temporary dot file and upon successfully writing every record to the dot file, it will rename the dot file to its final name. If the dot file cannot be renamed, the rename operation will be attempted up to 10 times, and if still not successful, the dot file will be deleted and the flow file will be routed to failure. If any error occurs while reading records from the input, or writing records to the output, the entire dot file will be removed and the flow file will be routed to failure or retry, depending on the error.putparquethadoopHDFSfilesystemrecordHadoop Configuration ResourcesHadoop Configuration ResourcesA file or comma separated list of files which contains the Hadoop file system configuration. Without this, Hadoop will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or will revert to a default configuration. 
To use swebhdfs, see 'Additional Details' section of PutHDFS's documentation.falsefalsetrueENVIRONMENTfalsefalseMULTIPLEFILEkerberos-credentials-serviceKerberos Credentials ServiceSpecifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberosorg.apache.nifi.kerberos.KerberosCredentialsServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalsekerberos-user-serviceKerberos User ServiceSpecifies the Kerberos User Controller Service that should be used for authenticating with Kerberosorg.apache.nifi.kerberos.KerberosUserServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalseKerberos PrincipalKerberos PrincipalKerberos principal to authenticate as. Requires nifi.kerberos.krb5.file to be set in your nifi.propertiesfalsefalsetrueENVIRONMENTfalsefalseKerberos KeytabKerberos KeytabKerberos keytab associated with the principal. Requires nifi.kerberos.krb5.file to be set in your nifi.propertiesfalsefalsetrueENVIRONMENTfalsefalseSINGLEFILEKerberos PasswordKerberos PasswordKerberos password associated with the principal.falsetruefalseNONEfalsefalseKerberos Relogin PeriodKerberos Relogin PeriodPeriod of time which should pass before attempting a kerberos relogin.
+This property has been deprecated, and has no effect on processing. Relogins now occur automatically.4 hoursfalsefalsetrueENVIRONMENTfalsefalseAdditional Classpath ResourcesAdditional Classpath ResourcesA comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.falsefalsefalseNONEtruefalseMULTIPLEFILEDIRECTORYfilenameFilenameThe name of the file to retrieve${path}/$(unknown)truefalsetrueFLOWFILE_ATTRIBUTESfalsefalserecord-writerRecord WriterThe service for writing records to the FlowFile contentorg.apache.nifi.serialization.RecordSetWriterFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseretryFlowFiles will be routed to this relationship if the content of the file cannot be retrieved, but might be able to be in the future if tried again. This generally indicates that the Fetch should be tried again.falsesuccessFlowFiles will be routed to this relationship once they have been updated with the content of the filefalsefailureFlowFiles will be routed to this relationship if the content of the file cannot be retrieved and trying again will likely not be helpful. 
This would occur, for instance, if the file is not found or if there is a permissions issuefalserecord.offsetGets the index of the first record in the input.record.countGets the number of records in the input.fetch.failure.reasonWhen a FlowFile is routed to 'failure', this attribute is added indicating why the file could not be fetched from the given filesystem.record.countThe number of records in the resulting flow filehadoop.file.urlThe hadoop url for the file is stored in this attribute.truetrue100 ms30 secWARNread distributed filesystemProvides the operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem.INPUT_REQUIREDorg.apache.nifi.processors.parquet.PutParquetorg.apache.nifi.processors.parquet.PutParquetPROCESSORReads records from an incoming FlowFile using the provided Record Reader, and writes those records to a Parquet file. The schema for the Parquet file must be provided in the processor properties. This processor will first write a temporary dot file and upon successfully writing every record to the dot file, it will rename the dot file to its final name. If the dot file cannot be renamed, the rename operation will be attempted up to 10 times, and if still not successful, the dot file will be deleted and the flow file will be routed to failure. If any error occurs while reading records from the input, or writing records to the output, the entire dot file will be removed and the flow file will be routed to failure or retry, depending on the error.putparquethadoopHDFSfilesystemrecordHadoop Configuration ResourcesHadoop Configuration ResourcesA file or comma separated list of files which contains the Hadoop file system configuration. Without this, Hadoop will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or will revert to a default configuration. 
To use swebhdfs, see 'Additional Details' section of PutHDFS's documentation.falsefalsetrueENVIRONMENTfalsefalseMULTIPLEFILEkerberos-credentials-serviceKerberos Credentials ServiceSpecifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberosorg.apache.nifi.kerberos.KerberosCredentialsServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalsekerberos-user-serviceKerberos User ServiceSpecifies the Kerberos User Controller Service that should be used for authenticating with Kerberosorg.apache.nifi.kerberos.KerberosUserServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalseKerberos PrincipalKerberos PrincipalKerberos principal to authenticate as. Requires nifi.kerberos.krb5.file to be set in your nifi.propertiesfalsefalsetrueENVIRONMENTfalsefalseKerberos KeytabKerberos KeytabKerberos keytab associated with the principal. Requires nifi.kerberos.krb5.file to be set in your nifi.propertiesfalsefalsetrueENVIRONMENTfalsefalseSINGLEFILEKerberos PasswordKerberos PasswordKerberos password associated with the principal.falsetruefalseNONEfalsefalseKerberos Relogin PeriodKerberos Relogin PeriodPeriod of time which should pass before attempting a kerberos relogin.
-This property has been deprecated, and has no effect on processing. Relogins now occur automatically.4 hoursfalsefalsetrueENVIRONMENTfalsefalseAdditional Classpath ResourcesAdditional Classpath ResourcesA comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.falsefalsefalseNONEtruefalseMULTIPLEDIRECTORYFILErecord-readerRecord ReaderThe service for reading records from incoming flow files.org.apache.nifi.serialization.RecordReaderFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseDirectoryDirectoryThe parent directory to which files should be written. Will be created if it doesn't exist.truefalsetrueFLOWFILE_ATTRIBUTESfalsefalsecompression-typeCompression TypeThe type of compression for the file being written.UNCOMPRESSEDUNCOMPRESSEDUNCOMPRESSEDSNAPPYSNAPPYGZIPGZIPLZOLZOBROTLIBROTLILZ4LZ4ZSTDZSTDLZ4_RAWLZ4_RAWtruefalsefalseNONEfalsefalseoverwriteOverwrite FilesWhether or not to overwrite existing files in the same directory with the same name. When set to false, flow files will be routed to failure when a file exists in the same directory with the same name.falsetruetruefalsefalsetruefalsefalseNONEfalsefalsepermissions-umaskPermissions umaskA umask represented as an octal number which determines the permissions of files written to HDFS. This overrides the Hadoop Configuration dfs.umaskmodefalsefalsefalseNONEfalsefalseremote-groupRemote GroupChanges the group of the HDFS file to this value after it is written. This only works if NiFi is running as a user that has HDFS super user privilege to change groupfalsefalsefalseNONEfalsefalseremote-ownerRemote OwnerChanges the owner of the HDFS file to this value after it is written. 
This only works if NiFi is running as a user that has HDFS super user privilege to change ownerfalsefalsefalseNONEfalsefalserow-group-sizeRow Group SizeThe row group size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsepage-sizePage SizeThe page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsedictionary-page-sizeDictionary Page SizeThe dictionary page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsemax-padding-sizeMax Padding SizeThe maximum amount of padding that will be used to align row groups with blocks in the underlying filesystem. If the underlying filesystem is not a block filesystem like HDFS, this has no effect. 
The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseenable-dictionary-encodingEnable Dictionary EncodingSpecifies whether dictionary encoding should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalseenable-validationEnable ValidationSpecifies whether validation should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalsewriter-versionWriter VersionSpecifies the version used by Parquet writerPARQUET_1_0PARQUET_1_0PARQUET_2_0PARQUET_2_0falsefalsefalseNONEfalsefalseavro-write-old-list-structureAvro Write Old List StructureSpecifies the value for 'parquet.avro.write-old-list-structure' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseavro-add-list-element-recordsAvro Add List Element RecordsSpecifies the value for 'parquet.avro.add-list-element-records' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseremove-crc-filesRemove CRC FilesSpecifies whether the corresponding CRC file should be deleted upon successfully writing a Parquet filefalsetruetruefalsefalsefalsefalsefalseNONEfalsefalseretryFlow Files that could not be processed due to issues that can be retried are transferred to this relationshipfalsesuccessFlow Files that have been successfully processed are transferred to this relationshipfalsefailureFlow Files that could not be processed due to issues that cannot be retried are transferred to this relationshipfalsefilenameThe name of the file to write comes from the value of this attribute.filenameThe name of the file is stored in this attribute.absolute.hdfs.pathThe absolute path to the file is stored in this attribute.hadoop.file.urlThe hadoop url for the file is stored in this attribute.record.countThe number of records written to the Parquet filetrue100 ms30 secWARNwrite distributed filesystemProvides the operator the ability to write any file 
that NiFi has access to in HDFS or the local filesystem.INPUT_REQUIREDorg.apache.nifi.parquet.ParquetReaderCONTROLLER_SERVICEParses Parquet data and returns each Parquet record as a separate Record object. The schema will come from the Parquet data itself.parquetparserecordrowreaderavro-read-compatibilityAvro Read CompatibilitySpecifies the value for 'parquet.avro.compatible' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseorg.apache.nifi.serialization.RecordReaderFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4org.apache.nifi.parquet.ParquetRecordSetWriterCONTROLLER_SERVICEWrites the contents of a RecordSet in Parquet format.parquetresultsetwriterserializerrecordrecordsetrowSchema Write StrategySchema Write StrategySpecifies how the schema for a Record should be added to the data.no-schemaDo Not Write Schemano-schemaDo not add any schema-related information to the FlowFile.Set 'schema.name' Attributeschema-nameThe FlowFile will be given an attribute named 'schema.name' and this attribute will indicate the name of the schema in the Schema Registry. Note that if the schema for a record is not obtained from a Schema Registry, then no attribute will be added.Set 'avro.schema' Attributefull-schema-attributeThe FlowFile will be given an attribute named 'avro.schema' and this attribute will contain the Avro Schema that describes the records in the FlowFile. 
The contents of the FlowFile need not be Avro, but the text of the schema will be used.Schema Reference Writerschema-reference-writerThe schema reference information will be written through a configured Schema Reference Writer service implementation.truefalsefalseNONEfalsefalseschema-cacheSchema CacheSpecifies a Schema Cache to add the Record Schema to so that Record Readers can quickly lookup the schema.org.apache.nifi.serialization.RecordSchemaCacheServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalseSchema Reference WriterSchema Reference WriterService implementation responsible for writing FlowFile attributes or content header with Schema reference informationorg.apache.nifi.schemaregistry.services.SchemaReferenceWriterorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseSchema Write StrategySchema Write Strategyschema-reference-writerschema-access-strategySchema Access StrategySpecifies how to obtain the schema that is to be used for interpreting the data.inherit-record-schemaInherit Record Schemainherit-record-schemaThe schema used to write records will be the same schema that was given to the Record when the Record was created.Use 'Schema Name' Propertyschema-nameThe name of the Schema to use is specified by the 'Schema Name' Property. The value of this property is used to lookup the Schema in the configured Schema Registry service.Use 'Schema Text' Propertyschema-text-propertyThe text of the Schema itself is specified by the 'Schema Text' Property. The value of this property must be a valid Avro Schema. 
If Expression Language is used, the value of the 'Schema Text' property must be valid after substituting the expressions.truefalsefalseNONEfalsefalseschema-registrySchema RegistrySpecifies the Controller Service to use for the Schema Registryorg.apache.nifi.schemaregistry.services.SchemaRegistryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalseschema-access-strategySchema Access Strategyschema-reference-readerschema-nameschema-nameSchema NameSpecifies the name of the schema to lookup in the Schema Registry property${schema.name}falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-nameschema-versionSchema VersionSpecifies the version of the schema to lookup in the Schema Registry. If not specified then the latest version of the schema will be retrieved.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-nameschema-branchSchema BranchSpecifies the name of the branch to use when looking up the schema in the Schema Registry property. 
If the chosen Schema Registry does not support branching, this value will be ignored.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-nameschema-textSchema TextThe text of an Avro-formatted Schema${avro.schema}falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-text-propertyschema-reference-readerSchema Reference ReaderService implementation responsible for reading FlowFile attributes or content to determine the Schema Reference Identifierorg.apache.nifi.schemaregistry.services.SchemaReferenceReaderorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseschema-access-strategySchema Access Strategyschema-reference-readercache-sizeCache SizeSpecifies how many Schemas should be cached1000truefalsefalseNONEfalsefalsecompression-typeCompression TypeThe type of compression for the file being written.UNCOMPRESSEDUNCOMPRESSEDUNCOMPRESSEDSNAPPYSNAPPYGZIPGZIPLZOLZOBROTLIBROTLILZ4LZ4ZSTDZSTDLZ4_RAWLZ4_RAWtruefalsefalseNONEfalsefalserow-group-sizeRow Group SizeThe row group size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsepage-sizePage SizeThe page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsedictionary-page-sizeDictionary Page SizeThe dictionary page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsemax-padding-sizeMax Padding SizeThe maximum amount of padding that will be used to align row groups with blocks in the underlying filesystem. If the underlying filesystem is not a block filesystem like HDFS, this has no effect. 
The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseenable-dictionary-encodingEnable Dictionary EncodingSpecifies whether dictionary encoding should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalseenable-validationEnable ValidationSpecifies whether validation should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalsewriter-versionWriter VersionSpecifies the version used by Parquet writerPARQUET_1_0PARQUET_1_0PARQUET_2_0PARQUET_2_0falsefalsefalseNONEfalsefalseavro-write-old-list-structureAvro Write Old List StructureSpecifies the value for 'parquet.avro.write-old-list-structure' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseavro-add-list-element-recordsAvro Add List Element RecordsSpecifies the value for 'parquet.avro.add-list-element-records' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseint96-fieldsINT96 FieldsList of fields with full path that should be treated as INT96 timestamps.falsefalsefalseNONEfalsefalseorg.apache.nifi.serialization.RecordSetWriterFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4
\ No newline at end of file
+This property has been deprecated, and has no effect on processing. Relogins now occur automatically.4 hoursfalsefalsetrueENVIRONMENTfalsefalseAdditional Classpath ResourcesAdditional Classpath ResourcesA comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.falsefalsefalseNONEtruefalseMULTIPLEFILEDIRECTORYrecord-readerRecord ReaderThe service for reading records from incoming flow files.org.apache.nifi.serialization.RecordReaderFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseDirectoryDirectoryThe parent directory to which files should be written. Will be created if it doesn't exist.truefalsetrueFLOWFILE_ATTRIBUTESfalsefalsecompression-typeCompression TypeThe type of compression for the file being written.UNCOMPRESSEDUNCOMPRESSEDUNCOMPRESSEDSNAPPYSNAPPYGZIPGZIPLZOLZOBROTLIBROTLILZ4LZ4ZSTDZSTDLZ4_RAWLZ4_RAWtruefalsefalseNONEfalsefalseoverwriteOverwrite FilesWhether or not to overwrite existing files in the same directory with the same name. When set to false, flow files will be routed to failure when a file exists in the same directory with the same name.falsetruetruefalsefalsetruefalsefalseNONEfalsefalsepermissions-umaskPermissions umaskA umask represented as an octal number which determines the permissions of files written to HDFS. This overrides the Hadoop Configuration dfs.umaskmodefalsefalsefalseNONEfalsefalseremote-groupRemote GroupChanges the group of the HDFS file to this value after it is written. This only works if NiFi is running as a user that has HDFS super user privilege to change groupfalsefalsefalseNONEfalsefalseremote-ownerRemote OwnerChanges the owner of the HDFS file to this value after it is written. 
This only works if NiFi is running as a user that has HDFS super user privilege to change ownerfalsefalsefalseNONEfalsefalserow-group-sizeRow Group SizeThe row group size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsepage-sizePage SizeThe page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsedictionary-page-sizeDictionary Page SizeThe dictionary page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsemax-padding-sizeMax Padding SizeThe maximum amount of padding that will be used to align row groups with blocks in the underlying filesystem. If the underlying filesystem is not a block filesystem like HDFS, this has no effect. 
The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseenable-dictionary-encodingEnable Dictionary EncodingSpecifies whether dictionary encoding should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalseenable-validationEnable ValidationSpecifies whether validation should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalsewriter-versionWriter VersionSpecifies the version used by Parquet writerPARQUET_1_0PARQUET_1_0PARQUET_2_0PARQUET_2_0falsefalsefalseNONEfalsefalseavro-write-old-list-structureAvro Write Old List StructureSpecifies the value for 'parquet.avro.write-old-list-structure' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseavro-add-list-element-recordsAvro Add List Element RecordsSpecifies the value for 'parquet.avro.add-list-element-records' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseremove-crc-filesRemove CRC FilesSpecifies whether the corresponding CRC file should be deleted upon successfully writing a Parquet filefalsetruetruefalsefalsefalsefalsefalseNONEfalsefalseretryFlow Files that could not be processed due to issues that can be retried are transferred to this relationshipfalsesuccessFlow Files that have been successfully processed are transferred to this relationshipfalsefailureFlow Files that could not be processed due to issues that cannot be retried are transferred to this relationshipfalsefilenameThe name of the file to write comes from the value of this attribute.filenameThe name of the file is stored in this attribute.absolute.hdfs.pathThe absolute path to the file is stored in this attribute.hadoop.file.urlThe hadoop url for the file is stored in this attribute.record.countThe number of records written to the Parquet filetrue100 ms30 secWARNwrite distributed filesystemProvides the operator the ability to write any file 
that NiFi has access to in HDFS or the local filesystem.INPUT_REQUIREDorg.apache.nifi.parquet.ParquetReaderCONTROLLER_SERVICEParses Parquet data and returns each Parquet record as a separate Record object. The schema will come from the Parquet data itself.parquetparserecordrowreaderavro-read-compatibilityAvro Read CompatibilitySpecifies the value for 'parquet.avro.compatible' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseorg.apache.nifi.serialization.RecordReaderFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4org.apache.nifi.parquet.ParquetRecordSetWriterCONTROLLER_SERVICEWrites the contents of a RecordSet in Parquet format.parquetresultsetwriterserializerrecordrecordsetrowSchema Write StrategySchema Write StrategySpecifies how the schema for a Record should be added to the data.no-schemaDo Not Write Schemano-schemaDo not add any schema-related information to the FlowFile.Set 'schema.name' Attributeschema-nameThe FlowFile will be given an attribute named 'schema.name' and this attribute will indicate the name of the schema in the Schema Registry. Note that if the schema for a record is not obtained from a Schema Registry, then no attribute will be added.Set 'avro.schema' Attributefull-schema-attributeThe FlowFile will be given an attribute named 'avro.schema' and this attribute will contain the Avro Schema that describes the records in the FlowFile. 
The contents of the FlowFile need not be Avro, but the text of the schema will be used.Schema Reference Writerschema-reference-writerThe schema reference information will be written through a configured Schema Reference Writer service implementation.truefalsefalseNONEfalsefalseschema-cacheSchema CacheSpecifies a Schema Cache to add the Record Schema to so that Record Readers can quickly lookup the schema.org.apache.nifi.serialization.RecordSchemaCacheServiceorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalseSchema Reference WriterSchema Reference WriterService implementation responsible for writing FlowFile attributes or content header with Schema reference informationorg.apache.nifi.schemaregistry.services.SchemaReferenceWriterorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseSchema Write StrategySchema Write Strategyschema-reference-writerschema-access-strategySchema Access StrategySpecifies how to obtain the schema that is to be used for interpreting the data.inherit-record-schemaInherit Record Schemainherit-record-schemaThe schema used to write records will be the same schema that was given to the Record when the Record was created.Use 'Schema Name' Propertyschema-nameThe name of the Schema to use is specified by the 'Schema Name' Property. The value of this property is used to lookup the Schema in the configured Schema Registry service.Use 'Schema Text' Propertyschema-text-propertyThe text of the Schema itself is specified by the 'Schema Text' Property. The value of this property must be a valid Avro Schema. 
If Expression Language is used, the value of the 'Schema Text' property must be valid after substituting the expressions.truefalsefalseNONEfalsefalseschema-registrySchema RegistrySpecifies the Controller Service to use for the Schema Registryorg.apache.nifi.schemaregistry.services.SchemaRegistryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4falsefalsefalseNONEfalsefalseschema-access-strategySchema Access Strategyschema-reference-readerschema-nameschema-nameSchema NameSpecifies the name of the schema to lookup in the Schema Registry property${schema.name}falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-nameschema-versionSchema VersionSpecifies the version of the schema to lookup in the Schema Registry. If not specified then the latest version of the schema will be retrieved.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-nameschema-branchSchema BranchSpecifies the name of the branch to use when looking up the schema in the Schema Registry property. 
If the chosen Schema Registry does not support branching, this value will be ignored.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-nameschema-textSchema TextThe text of an Avro-formatted Schema${avro.schema}falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseschema-access-strategySchema Access Strategyschema-text-propertyschema-reference-readerSchema Reference ReaderService implementation responsible for reading FlowFile attributes or content to determine the Schema Reference Identifierorg.apache.nifi.schemaregistry.services.SchemaReferenceReaderorg.apache.nifinifi-standard-services-api-nar2.0.0-M4truefalsefalseNONEfalsefalseschema-access-strategySchema Access Strategyschema-reference-readercache-sizeCache SizeSpecifies how many Schemas should be cached1000truefalsefalseNONEfalsefalsecompression-typeCompression TypeThe type of compression for the file being written.UNCOMPRESSEDUNCOMPRESSEDUNCOMPRESSEDSNAPPYSNAPPYGZIPGZIPLZOLZOBROTLIBROTLILZ4LZ4ZSTDZSTDLZ4_RAWLZ4_RAWtruefalsefalseNONEfalsefalserow-group-sizeRow Group SizeThe row group size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsepage-sizePage SizeThe page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsedictionary-page-sizeDictionary Page SizeThe dictionary page size used by the Parquet writer. The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalsemax-padding-sizeMax Padding SizeThe maximum amount of padding that will be used to align row groups with blocks in the underlying filesystem. If the underlying filesystem is not a block filesystem like HDFS, this has no effect. 
The value is specified in the format of <Data Size> <Data Unit> where Data Unit is one of B, KB, MB, GB, TB.falsefalsetrueFLOWFILE_ATTRIBUTESfalsefalseenable-dictionary-encodingEnable Dictionary EncodingSpecifies whether dictionary encoding should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalseenable-validationEnable ValidationSpecifies whether validation should be enabled for the Parquet writertruetruefalsefalsefalsefalsefalseNONEfalsefalsewriter-versionWriter VersionSpecifies the version used by Parquet writerPARQUET_1_0PARQUET_1_0PARQUET_2_0PARQUET_2_0falsefalsefalseNONEfalsefalseavro-write-old-list-structureAvro Write Old List StructureSpecifies the value for 'parquet.avro.write-old-list-structure' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseavro-add-list-element-recordsAvro Add List Element RecordsSpecifies the value for 'parquet.avro.add-list-element-records' in the underlying Parquet librarytruetruetruefalsefalsetruefalsefalseNONEfalsefalseint96-fieldsINT96 FieldsList of fields with full path that should be treated as INT96 timestamps.falsefalsefalseNONEfalsefalseorg.apache.nifi.serialization.RecordSetWriterFactoryorg.apache.nifinifi-standard-services-api-nar2.0.0-M4
\ No newline at end of file
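The manifest above describes CalculateParquetOffsets as emitting N FlowFiles, each tagged with `record.offset` and `record.count` attributes covering a contiguous slice of the input, sized by the Records Per Split property. The arithmetic this implies can be sketched as follows; the `calculate_offsets` helper name and its standalone form are illustrative assumptions, not NiFi's actual implementation:

```python
# Sketch of the splitting arithmetic implied by the CalculateParquetOffsets
# description: each output carries 'record.offset' (index of its first record)
# and 'record.count' (number of records in its slice). Illustrative only.

def calculate_offsets(total_records: int, records_per_split: int) -> list[dict]:
    """Return one attribute dict per output FlowFile."""
    if records_per_split <= 0:
        raise ValueError("Records Per Split must be positive")
    splits = []
    offset = 0
    while offset < total_records:
        # The final slice may be shorter than Records Per Split.
        count = min(records_per_split, total_records - offset)
        splits.append({"record.offset": offset, "record.count": count})
        offset += count
    return splits
```

For example, a 10-record input with Records Per Split set to 4 would yield three slices covering records 0-3, 4-7, and 8-9, matching the attribute semantics documented in the manifest.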