cloudpathlib Changelog
UNRELEASED
v0.19.0 (2024-08-29)
- Fixed an error that occurred when loading and dumping `CloudPath` objects using pickle multiple times. (Issue #450, PR #454, thanks to @kujenga)
- Fixed typo in `FileCacheMode` where values were being filled by environment variable `CLOUPATHLIB_FILE_CACHE_MODE` instead of `CLOUDPATHLIB_FILE_CACHE_MODE`. (PR #424, thanks to @mynameisfiber)
- Fixed `CloudPath` cleanup via `CloudPath.__del__` when `Client` encounters an exception during initialization and does not create a `file_cache_mode` attribute. (Issue #372, thanks to @bryanwweber)
- Removed support for Python 3.7 and pinned minimal `boto3` version to Python 3.8+ versions. (PR #407)
- Changed `GSClient` to use the native `exists()` method from the Google Cloud Storage SDK. (PR #420, thanks to @bachya)
- Changed default clients to be lazily instantiated (Issue #428, PR #432)
- Fixed `download_to` to check for the existence of the cloud file (Issue #430, PR #433)
- Added env vars `CLOUDPATHLIB_FORCE_OVERWRITE_FROM_CLOUD` and `CLOUDPATHLIB_FORCE_OVERWRITE_TO_CLOUD`. (Issue #393, PR #437)
- Fixed `glob` for `cloudpathlib.local.LocalPath` and subclass implementations to match behavior of cloud versions for parity in testing. (Issue #415, PR #436)
- Changed how `cloudpathlib.local.LocalClient` and subclass implementations track the default local storage directory (used to simulate the cloud) when no local storage directory is explicitly provided. (PR #436, PR #462)
  - Changed `LocalClient` so that client instances using the default storage access the default local storage directory through `get_default_storage_dir` rather than holding an explicit reference to the path set at instantiation. This means that calling `reset_default_storage_dir` will reset the local storage for all clients using the default local storage, whether the client has already been instantiated or is instantiated after resetting. This fixes unintuitive behavior where `reset_local_storage` did not reset local storage when using the default client. (Issue #414)
  - Added a new `local_storage_dir` property to `LocalClient`. This returns the current local storage directory used by that client instance.
- Refined the return type annotations for `CloudPath.open()` to match the behavior of `pathlib.Path.open()`. The method now returns specific types (`TextIOWrapper`, `FileIO`, `BufferedRandom`, `BufferedWriter`, `BufferedReader`, `BinaryIO`, `IO[Any]`) based on the provided `mode`, `buffering`, and `encoding` arguments. (Issue #465, PR #464)
- Added Azure Data Lake Storage Gen2 support (Issue #161, PR #450), thanks to @M0dEx for PR #447 and PR #449
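
The `LocalClient` change in v0.19.0 is easier to follow with a small illustration. The following is a minimal sketch using only the standard library; `SketchLocalClient` is a hypothetical stand-in and not cloudpathlib's implementation, and the method name `reset_default_storage_dir` is an assumption made for this example. It shows why looking up the default directory through a class-level accessor, rather than storing the path at instantiation, lets a reset reach clients that already exist.

```python
import tempfile
from pathlib import Path
from typing import Optional


class SketchLocalClient:
    """Hypothetical stand-in; NOT cloudpathlib's LocalClient."""

    # Shared by every client that did not receive an explicit directory.
    _default_dir: Optional[tempfile.TemporaryDirectory] = None

    def __init__(self, local_storage_dir: Optional[Path] = None):
        # None means "use whatever the class-level default currently is".
        self._explicit_dir = local_storage_dir

    @classmethod
    def get_default_storage_dir(cls) -> Path:
        # Create the shared default directory on first use.
        if cls._default_dir is None:
            cls._default_dir = tempfile.TemporaryDirectory()
        return Path(cls._default_dir.name)

    @classmethod
    def reset_default_storage_dir(cls) -> Path:
        # Assumed name for this sketch: discard the old default, make a new one.
        if cls._default_dir is not None:
            cls._default_dir.cleanup()
        cls._default_dir = None
        return cls.get_default_storage_dir()

    @property
    def local_storage_dir(self) -> Path:
        # Resolved on every access, so a reset is visible to existing clients.
        if self._explicit_dir is not None:
            return self._explicit_dir
        return self.get_default_storage_dir()


client = SketchLocalClient()
before = client.local_storage_dir
(before / "blob.txt").write_text("hello")

SketchLocalClient.reset_default_storage_dir()
after = client.local_storage_dir  # same client object, fresh empty directory
```

Had the client stored `before` as a plain attribute in `__init__`, the reset would have left it pointing at the old directory, which is the kind of unintuitive behavior the entry describes.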
v0.18.1 (2024-02-26)
- Fixed import error due to incompatible `google-cloud-storage` by not using `transfer_manager` if it is not available. (Issue #408, PR #410)

Includes all changes from v0.18.0.

Note: This is the last planned Python 3.7 compatible release version.
0.18.0 (2024-02-25) (Yanked)
- Implement sliced downloads in GSClient. (Issue #387, PR #389)
- Implement `as_url` with presigned parameter for all backends. (Issue #235, PR #236)
- Stream to and from Azure Blob Storage. (PR #403)
- Implement `file:` URI scheme support for `AnyPath`. (Issue #401, PR #404)

Note: This version was yanked due to incompatibility with google-cloud-storage <2.7.0 that causes an import error.
0.17.0 (2023-12-21)
- Fix `S3Client` cleanup via `Client.__del__` when `S3Client` encounters an exception during initialization. (Issue #372, PR #373, thanks to @bryanwweber)
- Skip mtime checks during upload when force_overwrite_to_cloud is set to improve upload performance. (Issue #379, PR #380, thanks to @Gilthans)
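
The `__del__` fixes in this release and in v0.19.0 relate to a general Python behavior: `__del__` still runs on an instance whose `__init__` raised, so attributes assigned late in `__init__` may be missing. The sketch below illustrates that pattern with a hypothetical `SketchClient` class; it is not cloudpathlib's code, and the attribute value `"tmp_dir"` is only illustrative.

```python
import gc

cleanup_log = []


class SketchClient:
    """Hypothetical stand-in for a client class; NOT cloudpathlib's code."""

    def __init__(self, fail: bool = False):
        if fail:
            # Raised before file_cache_mode is assigned, so the instance
            # is only partially initialized.
            raise ValueError("initialization failed")
        self.file_cache_mode = "tmp_dir"  # illustrative value

    def __del__(self):
        # __del__ also runs for partially initialized instances, so look the
        # attribute up defensively instead of assuming it exists.
        mode = getattr(self, "file_cache_mode", None)
        if mode is not None:
            cleanup_log.append(mode)


try:
    SketchClient(fail=True)
except ValueError:
    pass
gc.collect()
log_after_failed_init = list(cleanup_log)

client = SketchClient()
del client
gc.collect()
log_after_normal_cleanup = list(cleanup_log)
```

Without the `getattr` guard, the failed instance would hit an `AttributeError` inside `__del__`, which Python reports as an ignored exception at cleanup time.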
v0.16.0 (2023-10-09)
- Add "CloudPath" as return type on `__init__` for mypy issues. (Issue #179, PR #342)
- Add `with_stem` to all path types when python version supports it (>=3.9). (Issue #287, PR #290, thanks to @Gilthans)
- Add `newline` parameter to the `write_text` method to align to `pathlib` functionality as of Python 3.10. (PR #362, thanks to @pricemg)
- Add support for Python 3.12 (PR #364)
- Add `CLOUDPATHLIB_LOCAL_CACHE_DIR` env var for setting local_cache_dir default for clients (Issue #352, PR #357)
- Add `CONTRIBUTING.md` instructions for contributors (Issue #213, PR #367)
v0.15.1 (2023-07-12)
- Compatibility with pydantic >= 2.0.0. (PR #349)
v0.15.0 (2023-06-16)
- Changed return type for `CloudPathMeta.__call__` to fix problems with pyright/pylance (PR #330)
- Make `CloudPath.is_valid_cloudpath` a TypeGuard so that type checkers can know the subclass if `is_valid_cloudpath` is called (PR #337)
- Added `follow_symlinks` to `stat` for 3.11.4 compatibility (see bpo 39906)
- Add `follow_symlinks` to `is_dir` implementation for CPython `glob` compatibility (see CPython PR #104512)
v0.14.0 (2023-05-13)
- Changed to pyproject.toml-based build.
- Changed type hints from custom type variable `DerivedCloudPath` to `typing.Self` (PEP 673). This adds a dependency on the typing-extensions backport package for Python versions lower than 3.11.
- Fixed a runtime key error when an S3 object does not have the `Content-Type` metadata set. (Issue #331, PR #332)
v0.13.0 (2023-02-15)
- Implement `file_cache_mode`s to give users finer-grained control over when and how the cache is cleared. (Issue #10, PR #314)
- Speed up listing directories for Google Cloud Storage. (PR #318)
- Add compatibility for Python 3.11 (PR #317)
v0.12.1 (2023-01-04)
- Fix glob logic for buckets; add regression test; add error on globbing all buckets (Issue #311, PR #312)
v0.12.0 (2022-12-30)
- API Change: `S3Client` supports an `extra_args` kwarg now to pass extra args down to `boto3` functions; this enables Requester Pays bucket access and bucket encryption. (Issues #254, #180; PR #307)
- Speed up glob! (Issue #274, PR #304)
- Ability to list buckets/containers a user has access to. (Issue #48, PR #307)
- Remove overly specific status check and assert in production code on remove. (Issue #212, PR #307)
- Update docs, including accessing public buckets. (Issue #271, PR #307)
v0.11.0 (2022-12-18)
- API change: Add `ignore` parameter to `CloudPath.copytree` in order to match `shutil` API. (Issue #145, PR #272)
- Use the V2 version for listing objects `list_objects_v2` in `S3Client`. (Issue #155, PR #302)
- Add ability to use `.exists` to check for a raw bucket/container (no additional path components). (Issue #291, PR #302)
- Prevent data loss when renaming by skipping files that would be renamed to the same thing. (Issue #277, PR #278)
- Speed up common `glob`/`rglob` patterns. (Issue #274, PR #276)
v0.10.0 (2022-08-18)
- API change: Make `stat` on the base class a method instead of a property to follow `pathlib` (Issue #234, PR #250)
- Fixed "S3Path.exists() returns True on partial matches." (Issue #208, PR #244)
- Make `AnyPath` subclass of `AnyPath` (Issue #246, PR #251)
- Skip docstrings if not present to avoid failing under `-OO` (Issue #238, PR #249)
- Add `py.typed` file so mypy runs (Issue #243, PR #248)
v0.9.0 (2022-06-03)
- Added `absolute` to `CloudPath` (does nothing as `CloudPath` is always absolute) (PR #230)
- Added `resolve` to `CloudPath` (does nothing as `CloudPath` is resolved in advance) (Issue #151, PR #230)
- Added `relative_to` to `CloudPath` which returns a `PurePosixPath` (Issue #149, PR #230)
- Added `is_relative_to` to `CloudPath` (Issue #149, PR #230)
- Added `is_absolute` to `CloudPath` (always true as `CloudPath` is always absolute) (PR #230)
- Accept and delegate `read_text` parameters to cached file (PR #230)
- Added `exist_ok` parameter to `touch` (PR #230)
- Added `missing_ok` parameter to `unlink`, which defaults to True. This diverges from pathlib to maintain backward compatibility (PR #230)
- Fixed missing root object entries in documentation's Intersphinx inventory (Issue #211, PR #237)
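
For reference, the sketch below shows the standard-library `pathlib` behavior that two of the v0.9.0 entries are described against. It uses only `pathlib` and `tempfile`, not cloudpathlib itself, and the paths are made up for illustration: `relative_to` yields a `PurePosixPath`, and plain `pathlib.Path.unlink()` raises on a missing file because its `missing_ok` defaults to `False`, which is the default the changelog says cloudpathlib diverges from.

```python
import tempfile
from pathlib import Path, PurePosixPath

# relative_to on a POSIX-style path returns the remaining relative portion.
rel = PurePosixPath("/bucket/data/2022/file.csv").relative_to("/bucket/data")

# pathlib's unlink raises for a missing file unless missing_ok=True is passed.
with tempfile.TemporaryDirectory() as tmp:
    missing = Path(tmp) / "does-not-exist.txt"
    try:
        missing.unlink()
        pathlib_raised = False
    except FileNotFoundError:
        pathlib_raised = True

    # Opting in to the lenient behavior explicitly does not raise.
    missing.unlink(missing_ok=True)
```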
v0.8.0 (2022-05-19)
- Fixed pickling of `CloudPath` objects not working. (Issue #223, PR #224)
- Added functionality to push the MIME (media) type to the content type property on cloud providers by default. (Issue #222, PR #226)
v0.7.1 (2022-04-06)
- Fixed inadvertent inclusion of tests module in package. (Issue #173, PR #219)
v0.7.0 (2022-02-16)
- Fixed `glob` and `rglob` functions by using pathlib's globbing logic rather than fnmatch. (Issue #154)
- Fixed `iterdir` to not include self. (Issue #15)
- Fixed error when calling `suffix` and `suffixes` on a cloud path with no suffix. (Issue #120)
- Changed `parents` return type from list to tuple, to better match pathlib's tuple-like `_PathParents` return type.
- Remove support for Python 3.6. (Issue #186)
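
As background for the `glob` fix, here is a small standard-library sketch of one way fnmatch-style matching and pathlib-style matching can disagree. It is not taken from cloudpathlib and is not necessarily the exact case behind Issue #154; the key and pattern are made up. With `fnmatch`, `*` can match across `/` separators, whereas `pathlib` matches a pattern one path component at a time.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

key = "a/b/c.txt"
pattern = "a/*.txt"

# fnmatch treats the whole key as one string, so "*" swallows "b/c".
fnmatch_result = fnmatch(key, pattern)

# pathlib compares component by component, so "*.txt" must match a single
# component and "a" must be the directory directly above it.
pathlib_result = PurePosixPath(key).match(pattern)
```

Here `fnmatch_result` is `True` and `pathlib_result` is `False`, so a glob built on `fnmatch` could return files from nested "directories" that a `pathlib`-style glob would not.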
v0.6.5 (2022-01-25)
- Fixed error when "directories" created on AWS S3 were reported as files. (Issue #148, PR #190)
- Fixed bug where GCE machines can instantiate default client, but we don't attempt it. (Issue #191)
- Support `AWS_ENDPOINT_URL` environment variable to set the `endpoint_url` for `S3Client`. (PR #193)
v0.6.4 (2021-12-29)
- Fixed error where `BlobProperties` type hint causes import error if Azure dependencies not installed.
v0.6.3 (2021-12-29)
- Fixed error when using `rmtree` on nested directories for Google Cloud Storage and Azure Blob Storage. (Issue #184, PR #185)
- Fixed broken builds due to mypy errors in azure dependency (PR #177)
- Fixed dev tools for building and serving documentation locally (PR #178)
v0.6.2 (2021-09-20)
- Fixed error when importing `cloudpathlib` for missing `botocore` dependency when not installed with S3 dependencies. (PR #168)
v0.6.1 (2021-09-17)
- Fixed absolute documentation URLs to point to the new versioned documentation pages.
- Fixed bug where `no_sign_request` couldn't be used to download files since our code required list permissions to the bucket to do so. (Issue #169, PR #168)
v0.6.0 (2021-09-07)
- Added `no_sign_request` parameter to `S3Client` instantiation for anonymous requests for public resources on S3. See documentation for more details. (#164)
v0.5.0 (2021-08-31)
- Added `boto3_transfer_config` parameter to `S3Client` instantiation, which allows passing a `boto3.s3.transfer.TransferConfig` object and is useful for controlling multipart and thread use in uploads and downloads. See documentation for more details. (#150)
v0.4.1 (2021-05-29)
- Added support for custom S3-compatible object stores. This functionality is available via the `endpoint_url` keyword argument when instantiating an `S3Client` instance. See documentation for more details. (#138 thanks to @YevheniiSemendiak)
- Added `CloudPath.upload_from` which uploads the passed path to this CloudPath (issue #58)
- Added support for common file transfer functions based on `shutil`. Issue #108. PR #142.
  - `CloudPath.copy` copies a file from one location to another. Can be cloud -> local or cloud -> cloud. If `client` is not the same, the file transits through the local machine.
  - `CloudPath.copytree` recursively copies a directory from one location to another. Can be cloud -> local or cloud -> cloud. Uses `CloudPath.copy` so if `client` is not the same, the file transits through the local machine.
v0.4.0 (2021-03-13)
- Added rich comparison operator support to cloud paths, which means you can now use them with `sorted`. (#129)
- Added polymorphic class `AnyPath` which creates a cloud path or `pathlib.Path` instance appropriately for an input filepath. See new documentation for details and example usage. (#130)
- Added integration with Pydantic. See new documentation for details and example usage. (#130)
- Exceptions: (#131)
  - Changed all custom `cloudpathlib` exceptions to be located in new `cloudpathlib.exceptions` module.
  - Changed all custom `cloudpathlib` exceptions to subclass from new base `CloudPathException`. This allows for easy catching of any custom exception from `cloudpathlib`.
  - Changed all custom exception names to end with `Error` as recommended by PEP 8.
  - Changed various functions to throw new `CloudPathFileExistsError`, `CloudPathIsADirectoryError` or `CloudPathNotADirectoryError` exceptions instead of a generic `ValueError`.
  - Removed exception exports from the root `cloudpathlib` package namespace. Import from `cloudpathlib.exceptions` instead if needed.
- Fixed `download_to` method to handle case when source is a file and destination is a directory. (#121 thanks to @genziano)
- Fixed bug where `hash(...)` of a cloud path was not consistent with the equality operator. (#129)
- Fixed `AzureBlobClient` instantiation to throw new error `MissingCredentialsError` when no credentials are provided, instead of `AttributeError`. `LocalAzureBlobClient` has also been changed to accordingly error under those conditions. (#131)
- Fixed `GSClient` to instantiate as anonymous with public access only when instantiated with no credentials, instead of erroring. (#131)
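
The point of the shared `CloudPathException` base can be sketched as follows. The classes here are local stand-ins that only borrow names from the entries above so the example runs without cloudpathlib installed; they are not the library's definitions, and `list_children_sketch` is a hypothetical helper. With the library installed, the real classes would be imported from `cloudpathlib.exceptions` as the changelog describes.

```python
class CloudPathException(Exception):
    """Local stand-in for the base class named in the changelog."""


class CloudPathNotADirectoryError(CloudPathException):
    """Local stand-in for one of the specific errors named above."""


def list_children_sketch(path: str, is_dir: bool) -> list:
    # Hypothetical helper: raises a specific error rather than ValueError.
    if not is_dir:
        raise CloudPathNotADirectoryError(f"{path} is not a directory")
    return []


try:
    list_children_sketch("s3://bucket/file.txt", is_dir=False)
except CloudPathException as exc:
    # One except clause on the base class catches every specific subclass.
    caught = type(exc).__name__
```

Callers that care about a particular failure can still catch the specific subclass, while callers that only want "anything raised by the library" catch the base.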
v0.3.0 (2021-01-29)
- Added a new module `cloudpathlib.local` with utilities for mocking cloud paths in tests. The module has "Local" substitute classes that use the local filesystem in place of cloud storage. See the new documentation article "Testing code that uses cloudpathlib" to learn more about how to use them. (#107)
v0.2.1 (2021-01-25)
- Fixed bug where a `NameError` was raised if the Google Cloud Storage dependencies were not installed (even if using a different storage provider).
v0.2.0 (2021-01-23)
- Added support for Google Cloud Storage. Instantiate with URIs prefixed by `gs://` or explicitly using the `GSPath` class. (#113 thanks to @wolfgangwazzlestrauss)
- Changed backend logic to reduce number of network calls to cloud. This should result in faster cloud path operations, especially when dealing with many small files. (#110, #111)
v0.1.2 (2020-11-14)
- Fixed `CloudPath` instantiation so that reinstantiating with an existing `CloudPath` instance will reuse the same client, if a new client is not explicitly passed. This addresses the edge case of non-idempotency when reinstantiating a `CloudPath` instance with a non-default client. (#104)
v0.1.1 (2020-10-15)
- Fixed a character-encoding bug when building from source on Windows. (#98)
v0.1.0 (2020-10-06)
- Initial release of cloudpathlib with support for Amazon S3 and Azure Blob Storage! 🎉