Comments (1)
It seems that the FederatedDataset class can only download a dataset from the Hugging Face (HF) Hub. It should probably be changed so that a custom dataset in HF's dataset format can be passed in directly instead of always being downloaded.
flower/datasets/flwr_datasets/federated_dataset.py, lines 43 to 44 in f78ef0a
flower/datasets/flwr_datasets/federated_dataset.py, lines 237 to 239 in f78ef0a
Evidenced by:
fds.load_partition(1, split="train")
And the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[137], line 1
----> 1 fds.load_partition(1, split="train")
File ~...\Lib\site-packages\flwr_datasets\federated_dataset.py:131, in FederatedDataset.load_partition(self, partition_id, split)
108 """Load the partition specified by the idx in the selected split.
109
110 The dataset is downloaded only when the first call to `load_partition` or
(...)
128 Single partition from the dataset split.
129 """
130 if not self._dataset_prepared:
--> 131 self._prepare_dataset()
132 if self._dataset is None:
133 raise ValueError("Dataset is not loaded yet.")
File ~...\Lib\site-packages\flwr_datasets\federated_dataset.py:237, in FederatedDataset._prepare_dataset(self)
216 def _prepare_dataset(self) -> None:
217 """Prepare the dataset (prior to partitioning) by download, shuffle, replit.
218
219 Run only ONCE when triggered by load_* function. (In future more control whether
(...)
235 happen before the resplitting.
236 """
--> 237 self._dataset = datasets.load_dataset(
238 path=self._dataset_name, name=self._subset
239 )
240 if self._shuffle:
241 # Note it shuffles all the splits. The self._dataset is DatasetDict
242 # so e.g. {"train": train_data, "test": test_data}. All splits get shuffled.
243 self._dataset = self._dataset.shuffle(seed=self._seed)
File ~...\Lib\site-packages\datasets\load.py:2538, in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, verification_mode, ignore_verifications, keep_in_memory, save_infos, revision, token, use_auth_token, task, streaming, num_proc, storage_options, trust_remote_code, **config_kwargs)
2536 if data_files is not None and not data_files:
2537 raise ValueError(f"Empty 'data_files': '{data_files}'. It should be either non-empty or None (default).")
-> 2538 if Path(path, config.DATASET_STATE_JSON_FILENAME).exists():
2539 raise ValueError(
2540 "You are trying to load a dataset that was saved using `save_to_disk`. "
2541 "Please use `load_from_disk` instead."
2542 )
2544 if streaming and num_proc is not None:
File ~...\Lib\pathlib.py:1162, in Path.__init__(self, *args, **kwargs)
1159 msg = ("support for supplying keyword arguments to pathlib.PurePath "
1160 "is deprecated and scheduled for removal in Python {remove}")
1161 warnings._deprecated("pathlib.PurePath(**kwargs)", msg, remove=(3, 14))
-> 1162 super().__init__(*args)
File ~...\Lib\pathlib.py:373, in PurePath.__init__(self, *args)
371 path = arg
372 if not isinstance(path, str):
--> 373 raise TypeError(
374 "argument should be a str or an os.PathLike "
375 "object where __fspath__ returns a str, "
376 f"not {type(path).__name__!r}")
377 paths.append(path)
378 self._raw_paths = paths
TypeError: argument should be a str or an os.PathLike object where __fspath__ returns a str, not 'Dataset'
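The traceback shows the value stored in self._dataset_name being forwarded as path to datasets.load_dataset, which wraps it in Path(...). An already-loaded Dataset object therefore fails with the TypeError above. The sketch below uses only the standard library to reproduce that failure mode and to illustrate the requested behaviour: load when given a name or path, pass through when given an object. InMemoryDataset, resolve_dataset, and the loader callback are hypothetical names introduced for illustration; they are not part of flwr_datasets or datasets.

```python
import os
from pathlib import Path
from typing import Any, Callable


class InMemoryDataset:
    """Hypothetical stand-in for an already-loaded dataset object."""

    def __init__(self, rows: list) -> None:
        self.rows = rows


def fails_as_path(obj: Any) -> bool:
    """Return True if building a Path from obj raises TypeError.

    This mirrors the Path(path, ...) call seen in the traceback.
    """
    try:
        Path(obj, "state.json")  # placeholder file name, for illustration only
    except TypeError:
        return True
    return False


def resolve_dataset(source: Any, loader: Callable[[str], Any]) -> Any:
    """Load by name/path when given one; otherwise return the object unchanged.

    `loader` is an assumed callback standing in for whatever function
    downloads or reads the dataset from a name or path.
    """
    if isinstance(source, (str, os.PathLike)):
        return loader(os.fspath(source))
    return source


if __name__ == "__main__":
    custom = InMemoryDataset([1, 2, 3])
    print(fails_as_path(custom))  # an object is not a valid path component
    print(resolve_dataset(custom, loader=lambda name: InMemoryDataset([name])) is custom)
```

A dispatch like this would let callers keep passing a dataset name as before while also accepting a custom dataset they have prepared themselves. Whether FederatedDataset should do this internally is a design decision for the maintainers.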