This is the list of items in the pandas 2.2 release notes that need to be addressed in pandas-stubs. I have removed the sections Performance improvements and Bug fixes.
PRs welcome. If you do a PR, check off the item and put a link to the PR that closed it. One PR can address multiple issues.
Some of these may already have been taken care of; if so, check them off and indicate that with a comment such as "previously complete".
Upcoming changes in pandas 3.0
Dedicated string data type (backed by Arrow) by default
Enhancements
ADBC Driver support in to_sql and read_sql
Create a pandas Series based on one or more conditions
Series.struct accessor for PyArrow structured data
Series.list accessor for PyArrow list data
Calamine engine for read_excel
DataFrame.to_sql with method parameter set to multi works with Oracle on the backend
read_sas returns datetime64 dtypes with resolutions better matching those stored natively in SAS, and avoids returning object-dtype in cases that cannot be stored with datetime64[ns] dtype (ENH: non-nano datetime64s for read_sas pandas#56127)
DataFrame.apply now allows the usage of numba (via engine="numba") to JIT compile the passed function, allowing for potential speedups (ENH: numba engine in df.apply pandas#54666)
Deprecated pandas.api.types.is_interval and pandas.api.types.is_period, use isinstance(obj, pd.Interval) and isinstance(obj, pd.Period) instead (DEPR: is_decimal, is_interval pandas#55264)
Deprecated DataFrameGroupBy.fillna and SeriesGroupBy.fillna; use DataFrameGroupBy.ffill, DataFrameGroupBy.bfill for forward and backward filling or DataFrame.fillna to fill with a single value (or the Series equivalents) (DEPR: groupby.fillna pandas#55718)
Deprecated behavior of Index.insert with an object-dtype index silently performing type inference on the result, explicitly call result.infer_objects(copy=False) for the old behavior instead (API: Index.insert too much dtype inference pandas#51363)
Deprecated the DataFrameGroupBy.grouper and SeriesGroupBy.grouper; these attributes will be removed in a future version of pandas (DEPR: groupby.grouper pandas#56521)
Deprecated the behavior of DataFrame.replace and Series.replace with CategoricalDtype; in a future version replace will change the values while preserving the categories. To change the categories, use ser.cat.rename_categories instead (DEPR/API: Series[categorical].replace behavior pandas#55147)
Deprecated the behavior of Series.value_counts and Index.value_counts with object dtype; in a future version these will not perform dtype inference on the resulting Index, do result.index = result.index.infer_objects() to retain the old behavior (DEPR: dtype inference in value_counts pandas#56161)
Deprecated the extension test classes BaseNoReduceTests, BaseBooleanReduceTests, and BaseNumericReduceTests, use BaseReduceTests instead (DEPR: BaseNoReduceTests pandas#54663)
Deprecated the option mode.data_manager and the ArrayManager; only the BlockManager will be available in future versions (DEPR: ArrayManager pandas#55043)
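For anyone checking stubs against these deprecations, here is a minimal sketch of the replacement spellings that the notes above recommend for three of the items (isinstance checks, groupby ffill, and explicit infer_objects after value_counts). The sample data and variable names are made up for illustration, and the snippet assumes a pandas 2.x install.

```python
import pandas as pd

# isinstance checks in place of the deprecated
# pandas.api.types.is_interval / is_period helpers.
interval = pd.Interval(0, 1)
period = pd.Period("2024-01", freq="M")
is_interval = isinstance(interval, pd.Interval)
is_period = isinstance(period, pd.Period)

# Forward fill within groups via groupby(...).ffill() in place of
# the deprecated groupby(...).fillna(...).
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b"], "val": [1.0, None, None, 4.0]}
)
filled = df.groupby("key")["val"].ffill()

# Object-dtype value_counts: infer the index dtype explicitly so the
# result does not depend on the deprecated implicit inference.
ser = pd.Series([1, 2, 2], dtype=object)
counts = ser.value_counts()
counts.index = counts.index.infer_objects()
```

The same patterns could be used as test inputs when verifying that the stubs accept the recommended replacements.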
Enhancements
Calamine engine for read_excel
DataFrame.to_sql with method parameter set to multi works with Oracle on the backend
Series.attrs / DataFrame.attrs now uses a deepcopy for propagating attrs (BUG: copy.deepcopy() doesn't deepcopy the metadata in .attrs pandas#54134)
get_dummies now returning extension dtypes boolean or bool[pyarrow] that are compatible with the input dtype (BUG: pd.get_dummies should return bool[pyarrow] types pandas#56273)
read_csv now supports on_bad_lines parameter with engine="pyarrow" (ENH: Pandas 2.0 with pyarrow engine add the argument like 'skip_bad_lines=True' pandas#54480)
read_sas returns datetime64 dtypes with resolutions better matching those stored natively in SAS, and avoids returning object-dtype in cases that cannot be stored with datetime64[ns] dtype (ENH: non-nano datetime64s for read_sas pandas#56127)
read_spss now returns a DataFrame that stores the metadata in DataFrame.attrs (read_spss doens't return the metadata pandas#54264)
tseries.api.guess_datetime_format is now part of the public API (ENH: make guess_datetime_format public pandas#54727)
DataFrame.apply now allows the usage of numba (via engine="numba") to JIT compile the passed function, allowing for potential speedups (ENH: numba engine in df.apply pandas#54666)
ExtensionArray._explode interface method added to allow extension type implementations of the explode method (ENH: make _explode a method of the ExtensionArray interface pandas#54833)
ExtensionArray.duplicated added to allow extension type implementations of the duplicated method (ENH/PERF: add ExtensionArray.duplicated pandas#55255)
Series.ffill, Series.bfill, DataFrame.ffill, and DataFrame.bfill have gained the argument limit_area; 3rd party ExtensionArray authors need to add this argument to the method _pad_or_backfill (ENH: add limit_area argument to ffill() method as interpolate( method='ffill', limit_area='inside') as been deprecated pandas#56492)
read_only, data_only and keep_links arguments to openpyxl using engine_kwargs of read_excel (ENH: _openpyxl.py load_workbook allow to modify the read_only, data_only and keep_links parameters using engine_kwargs pandas#55027)
Series.interpolate and DataFrame.interpolate for ArrowDtype and masked dtypes (.interpolate() Method Incompatible with float[pyarrow] Dtype pandas#56267)
Series.value_counts (ENH: Implement masked algorithm for value_counts pandas#54984)
Series.dt methods and attributes for ArrowDtype with pyarrow.duration type (BUG: Cannot access Timedelta properties with Arrow Backend pandas#52284)
Series.str.extract for ArrowDtype (BUG: str.extract Method Not Implemented for pd.ArrowDtype(pa.string()) pandas#56268)
DatetimeIndex.to_period with frequencies which are not supported as period frequencies, such as "BMS" (ENH: Raise TypeError when converting DatetimeIndex to PeriodIndex with invalid period frequency pandas#56243)
Period with invalid offsets such as "QS" (BUG: QuarterBegin Does not work with Period pandas#55785)
string[pyarrow] and string[pyarrow_numpy] now both utilize the large_string type from PyArrow to avoid overflow for long columns (BUG: new string dtype fails with >2 GB of data in a single column pandas#56259)
Notable bug fixes
check_exact now only takes effect for floating-point dtypes in testing.assert_frame_equal and testing.assert_series_equal. In particular, integer dtypes are always checked exactly (BUG: assert_series_equal not raising on unequal series? pandas#55882)
Deprecations
M, Q, Y, etc. in favour of ME, QE, YE, etc. for offsets
Timedelta.resolution_string to return h, min, s, ms, us, and ns instead of H, T, S, L, U, and N, for compatibility with respective deprecations in frequency aliases (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
offsets.Day.delta, offsets.Hour.delta, offsets.Minute.delta, offsets.Second.delta, offsets.Milli.delta, offsets.Micro.delta, offsets.Nano.delta, use pd.Timedelta(obj) instead (DEPR: Tick.delta pandas#55498)
Deprecated pandas.api.types.is_interval and pandas.api.types.is_period, use isinstance(obj, pd.Interval) and isinstance(obj, pd.Period) instead (DEPR: is_decimal, is_interval pandas#55264)
read_gbq and DataFrame.to_gbq. Use pandas_gbq.read_gbq and pandas_gbq.to_gbq instead https://pandas-gbq.readthedocs.io/en/latest/api.html (DEPR: read_gbq, DataFrame.to_gbq pandas#55525)
Deprecated DataFrameGroupBy.fillna and SeriesGroupBy.fillna; use DataFrameGroupBy.ffill, DataFrameGroupBy.bfill for forward and backward filling or DataFrame.fillna to fill with a single value (or the Series equivalents) (DEPR: groupby.fillna pandas#55718)
DateOffset.is_anchored, use obj.n == 1 for non-Tick subclasses (for Tick this was always False) (DOC: DateOffset.is_anchored track down intention, fix docstring pandas#55388)
DatetimeArray.__init__ and TimedeltaArray.__init__, use array instead (DEPR: DTA/TDA.__init__ pandas#55623)
Index.format, use index.astype(str) or index.map(formatter) instead (DEPR: Index.format? pandas#55413)
Series.ravel, the underlying array is already 1D, so ravel is not necessary (DEPR: Deprecate Series.ravel pandas#52511) ravel in Index and Series pandas-dev/pandas#36900 pandas-dev/pandas#52511 #1613
Series.resample and DataFrame.resample with a PeriodIndex (and the 'convention' keyword), convert to DatetimeIndex (with .to_timestamp()) before resampling instead (DEPR: Resample with PeriodIndex? pandas#53481). Note: this deprecation was later undone in pandas 2.3.3 (QST: FutureWarning: Resampling with a PeriodIndex is deprecated, how to resample now? pandas#57033)
Series.view, use Series.astype instead to change the dtype (DEPR: Series.view pandas#20251)
offsets.Tick.is_anchored, use False instead (DOC: DateOffset.is_anchored track down intention, fix docstring pandas#55388)
core.internals members Block, ExtensionBlock, and DatetimeTZBlock, use public APIs instead (DEPR: deprecate exposing blocks in core.internals pandas#55139)
year, month, quarter, day, hour, minute, and second keywords in the PeriodIndex constructor, use PeriodIndex.from_fields instead (DEPR: PeriodIndex.__new__ accepting ordinals, fields pandas#55960)
Index.view, call without any arguments instead (API/DEPR: Index.view _typ check, return type pandas#55709)
periods argument in date_range, timedelta_range, period_range, and interval_range (DEPR: fractional periods in date_range, timedelta_range, period_range… pandas#56036)
DataFrame.to_clipboard (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_csv except path_or_buf (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_dict (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_excel except excel_writer (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_gbq except destination_table (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_hdf except path_or_buf (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_html except buf (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_json except path_or_buf (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_latex except buf (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_markdown except buf (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_parquet except path (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_pickle except path (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_string except buf (DEPR: Positional arguments in to_* I/O methods pandas#54229)
DataFrame.to_xml except path_or_buffer (DEPR: Positional arguments in to_* I/O methods pandas#54229)
BlockManager objects to DataFrame or SingleBlockManager objects to Series (DEPR: accepting Manager objects in DataFrame/Series pandas#52419)
Deprecated behavior of Index.insert with an object-dtype index silently performing type inference on the result, explicitly call result.infer_objects(copy=False) for the old behavior instead (API: Index.insert too much dtype inference pandas#51363)
Series.isin and Index.isin with datetime64, timedelta64, and PeriodDtype dtypes (DEPR: casting in DatetimeLikeArrayMixin.isin pandas#53111)
Index, Series and DataFrame constructors when giving a pandas input, call .infer_objects on the input to keep the current behavior (DEPR: Series and Index shouldn't do inference on pandas objects pandas#56012)
Index into a DataFrame, cast explicitly instead (DEPR: Disallow dtype inference when setting Index into DataFrame pandas#56102)
DataFrameGroupBy.apply and DataFrameGroupBy.resample; pass include_groups=False to exclude the groups (API: way to exclude the grouped column with apply pandas#7155)
Index with a boolean indexer of length zero (BUG: inconsistency in Index.__getitem__ for Numpy and non-numpy dtypes pandas#55820)
DataFrameGroupBy.get_group or SeriesGroupBy.get_group when grouping by a length-1 list-like (Groupby on single key should be accessible by tuple of length 1 pandas#25971)
AS denoting frequency in YearBegin and strings AS-DEC, AS-JAN, etc. denoting annual frequencies with various fiscal year starts (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
A denoting frequency in YearEnd and strings A-DEC, A-JAN, etc. denoting annual frequencies with various fiscal year ends (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
BAS denoting frequency in BYearBegin and strings BAS-DEC, BAS-JAN, etc. denoting annual frequencies with various fiscal year starts (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
BA denoting frequency in BYearEnd and strings BA-DEC, BA-JAN, etc. denoting annual frequencies with various fiscal year ends (DEPR: deprecate the alias 'A' in favour of 'Y' for year end frequency pandas#54275)
H, BH, and CBH denoting frequencies in Hour, BusinessHour, CustomBusinessHour (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
H, S, U, and N denoting units in to_timedelta (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
H, T, S, L, U, and N denoting units in Timedelta (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
T, S, L, U, and N denoting frequencies in Minute, Second, Milli, Micro, Nano (BUG: Either incorrect unit validation for 'T' in to_timedelta() or incorrect documentation pandas#52536)
read_csv along with the keep_date_col keyword (DEPR: read_csv keywords: keep_date_col, delim_whitespace pandas#55569)
Deprecated the DataFrameGroupBy.grouper and SeriesGroupBy.grouper; these attributes will be removed in a future version of pandas (DEPR: groupby.grouper pandas#56521)
Grouping attributes group_index, result_index, and group_arraylike; these will be removed in a future version of pandas (DEPR: Certain Grouper and Grouping attributes pandas#56148)
delim_whitespace keyword in read_csv and read_table, use sep="\\s+" instead (DEPR: read_csv keywords: keep_date_col, delim_whitespace pandas#55569)
errors="ignore" option in to_datetime, to_timedelta, and to_numeric; explicitly catch exceptions instead (DEPR: deprecate errors='ignore' in to_datetime and make output dtype predictable pandas#54467)
fastpath keyword in the Series constructor (CLN: remove fastpath & verify_integrity from constructors pandas#20110)
kind keyword in Series.resample and DataFrame.resample, explicitly cast the object's index instead (DEPR: kind keyword in resample pandas#55895)
ordinal keyword in PeriodIndex, use PeriodIndex.from_ordinals instead (DEPR: PeriodIndex.__new__ accepting ordinals, fields pandas#55960)
unit keyword in TimedeltaIndex construction, use to_timedelta instead (DEPR: DatetimeIndex/TimedeltaIndex constructor keywords pandas#55499)
verbose keyword in read_csv and read_table (DEPR: read_csv keywords: keep_date_col, delim_whitespace pandas#55569)
Deprecated the behavior of DataFrame.replace and Series.replace with CategoricalDtype; in a future version replace will change the values while preserving the categories. To change the categories, use ser.cat.rename_categories instead (DEPR/API: Series[categorical].replace behavior pandas#55147)
Deprecated the behavior of Series.value_counts and Index.value_counts with object dtype; in a future version these will not perform dtype inference on the resulting Index, do result.index = result.index.infer_objects() to retain the old behavior (DEPR: dtype inference in value_counts pandas#56161)
observed=False in DataFrame.pivot_table; will be True in a future version (DEPR: observed=False default for DataFrame.pivot_table pandas#56236)
Deprecated the extension test classes BaseNoReduceTests, BaseBooleanReduceTests, and BaseNumericReduceTests, use BaseReduceTests instead (DEPR: BaseNoReduceTests pandas#54663)
Deprecated the option mode.data_manager and the ArrayManager; only the BlockManager will be available in future versions (DEPR: ArrayManager pandas#55043)
DataFrame.stack; specify future_stack=True to adopt the future version (DEPR/BUG: DataFrame.stack including all null rows when stacking multiple levels pandas#53515)
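As a rough illustration of a few of the replacements named in the Deprecations list (sep with a whitespace regex for delim_whitespace, pd.Timedelta(obj) for the Tick .delta attribute, explicit exception handling for errors="ignore", and keyword-only arguments in to_csv), here is a small sketch. The helper to_numeric_or_original and the sample data are hypothetical and exist only for this example; the calls are ones that should also run on pandas versions before 2.2.

```python
import io

import pandas as pd

# sep=r"\s+" in place of the deprecated delim_whitespace=True.
text = "a  b\n1  2\n3  4\n"
df = pd.read_csv(io.StringIO(text), sep=r"\s+")

# pd.Timedelta(obj) in place of the deprecated offsets.Hour(...).delta.
two_hours = pd.Timedelta(pd.offsets.Hour(2))


def to_numeric_or_original(values):
    """Hypothetical helper: catch the error instead of errors="ignore"."""
    try:
        return pd.to_numeric(values)
    except (ValueError, TypeError):
        return values


# Lowercase unit string; the uppercase unit aliases are the deprecated ones.
half_minute = pd.to_timedelta(30, unit="s")

# Everything other than path_or_buf is passed by keyword to to_csv.
csv_text = df.to_csv(index=False)
```

Snippets like this may be handy as positive test cases when checking that the stubs still accept the non-deprecated spellings.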