Schema and datatypes

Schemas are defined using SQL syntax. The table below lists the supported datatypes and the corresponding value types in Python and Java:

| Data type | SQL name | Value type in Python | Value type in Java |
| --- | --- | --- | --- |
| ArrayType | ARRAY<element_type> | list, tuple, or array | java.util.List |
| BinaryType | BINARY | bytearray | byte[] |
| BooleanType | BOOLEAN | bool | boolean or Boolean |
| ByteType | BYTE, TINYINT | int or long | byte or Byte |
| DateType | DATE | datetime.date | java.time.LocalDate or java.sql.Date |
| DayTimeIntervalType | INTERVAL DAY, INTERVAL DAY TO HOUR, INTERVAL DAY TO MINUTE, INTERVAL DAY TO SECOND | datetime.timedelta | java.time.Duration |
| DayTimeIntervalType | INTERVAL HOUR, INTERVAL HOUR TO MINUTE, INTERVAL HOUR TO SECOND | datetime.timedelta | java.time.Duration |
| DayTimeIntervalType | INTERVAL MINUTE, INTERVAL MINUTE TO SECOND, INTERVAL SECOND | datetime.timedelta | java.time.Duration |
| DecimalType | DECIMAL, DEC, NUMERIC | decimal.Decimal | java.math.BigDecimal |
| DoubleType | DOUBLE | float | double or Double |
| FloatType | FLOAT, REAL | float | float or Float |
| IntegerType | INT, INTEGER | int or long | int or Integer |
| LongType | LONG, BIGINT | int or long | long or Long |
| MapType | MAP<key_type, value_type> | dict | java.util.Map |
| ShortType | SHORT, SMALLINT | int or long | short or Short |
| StringType | STRING | str | String |
| StructType | STRUCT<field1_name: field1_type, field2_name: field2_type, …> | list or tuple | org.apache.spark.sql.Row |
| TimestampNTZType | TIMESTAMP_NTZ | datetime.datetime | java.time.LocalDateTime |
| TimestampType | TIMESTAMP, TIMESTAMP_LTZ | datetime.datetime | java.time.Instant or java.sql.Timestamp |
| YearMonthIntervalType | INTERVAL YEAR, INTERVAL YEAR TO MONTH, INTERVAL MONTH | datetime.timedelta | java.time.Period |
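
To make the mapping concrete, the sketch below pairs a schema written in SQL syntax with a Python row whose values use the types listed in the table. The column names, the `EXPECTED_PYTHON_TYPES` dictionary, and the `mismatched_columns` helper are all hypothetical and exist only for illustration. The check is plain Python and does not call any library API.

```python
import datetime
import decimal

# Hypothetical schema in SQL syntax; the column names are invented for this example.
SCHEMA = (
    "id BIGINT, name STRING, active BOOLEAN, score DOUBLE, "
    "price DECIMAL, born DATE, tags ARRAY<STRING>, "
    "attrs MAP<STRING, INT>, payload BINARY"
)

# Python value types taken from the table, keyed by column name.
# This dictionary is an illustrative helper, not part of any library.
EXPECTED_PYTHON_TYPES = {
    "id": int,                  # BIGINT  -> int
    "name": str,                # STRING  -> str
    "active": bool,             # BOOLEAN -> bool
    "score": float,             # DOUBLE  -> float
    "price": decimal.Decimal,   # DECIMAL -> decimal.Decimal
    "born": datetime.date,      # DATE    -> datetime.date
    "tags": (list, tuple),      # ARRAY   -> list or tuple
    "attrs": dict,              # MAP     -> dict
    "payload": bytearray,       # BINARY  -> bytearray
}

# A sample row whose values follow the mapping above.
row = {
    "id": 42,
    "name": "widget",
    "active": True,
    "score": 0.75,
    "price": decimal.Decimal("19.99"),
    "born": datetime.date(2024, 1, 31),
    "tags": ["a", "b"],
    "attrs": {"width": 3},
    "payload": bytearray(b"\x00\x01"),
}


def mismatched_columns(candidate):
    """Return the names of columns whose value is not of the expected Python type."""
    return [
        name
        for name, expected in EXPECTED_PYTHON_TYPES.items()
        if not isinstance(candidate[name], expected)
    ]


print(SCHEMA)
print(mismatched_columns(row))  # prints [] because every value matches its column type
```

A row that stores a string in the `score` column would be reported as `["score"]`, because DOUBLE maps to a Python `float`.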