
wywongbd / autocluster

AutoML for clustering models in sklearn.

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
automl bayesian-optimization clustering hyperparameter-optimization

autocluster's People

Contributors

coffeetumbler, dependabot[bot], j-chan-hkust, renxida, snuseungjun, wywongbd

autocluster's Issues

Matplotlib known build error on version 3.0.3, upgrade recommended

matplotlib/matplotlib#13555

A plain pip install consistently fails with the same error:

` pip3 --no-cache-dir install autocluster
Looking in indexes: https://pypi.org/simple, https://1205d49dc47b4644d672f57e74f850e6342693e3f0b8cf0b:****@packagecloud.io/agrible/internal/pypi/simple
Collecting autocluster
Downloading autocluster-0.5.2-py3-none-any.whl (35 kB)
Requirement already satisfied: six>=1.5.0 in /usr/lib/python3/dist-packages (from autocluster) (1.14.0)
Collecting matplotlib==3.0.3
Downloading matplotlib-3.0.3.tar.gz (36.6 MB)
|████████████████████████████████| 36.6 MB 3.1 MB/s
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-ecu3m8rg/matplotlib/setup.py'"'"'; __file__='"'"'/tmp/pip-install-ecu3m8rg/matplotlib/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-ecu3m8rg/matplotlib/pip-egg-info
cwd: /tmp/pip-install-ecu3m8rg/matplotlib/
Complete output (48 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-ecu3m8rg/matplotlib/setup.py", line 225, in <module>
msg = pkg.install_help_msg()
File "/tmp/pip-install-ecu3m8rg/matplotlib/setupext.py", line 650, in install_help_msg
release = platform.linux_distribution()[0].lower()
AttributeError: module 'platform' has no attribute 'linux_distribution'
============================================================================
Edit setup.cfg to change the build options

BUILDING MATPLOTLIB
            matplotlib: yes [3.0.3]
                python: yes [3.8.2 (default, Apr 27 2020, 15:53:34)  [GCC
                        9.3.0]]
              platform: yes [linux]

REQUIRED DEPENDENCIES AND EXTENSIONS
                 numpy: yes [version 1.18.5]
      install_requires: yes [handled by setuptools]
                libagg: yes [pkg-config information for 'libagg' could not
                        be found. Using local copy.]
              freetype: no  [The C/C++ header for freetype2 (ft2build.h)
                        could not be found.  You may need to install the
                        development package.]
                   png: no  [pkg-config information for 'libpng' could not
                        be found.]
                 qhull: yes [pkg-config information for 'libqhull' could not
                        be found. Using local copy.]

OPTIONAL SUBPACKAGES
           sample_data: yes [installing]
              toolkits: yes [installing]
                 tests: no  [skipping due to configuration]
        toolkits_tests: no  [skipping due to configuration]

OPTIONAL BACKEND EXTENSIONS
                   agg: yes [installing]
                 tkagg: yes [installing; run-time loading from Python Tcl /
                        Tk]
                macosx: no  [Mac OS-X only]
             windowing: no  [Microsoft Windows only]

OPTIONAL PACKAGE DATA
                  dlls: no  [skipping due to configuration]

============================================================================
                        * The following required packages can not be built:
                        * freetype, png
----------------------------------------

ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.`
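
The traceback above fails on `platform.linux_distribution()`, which was deprecated in Python 3.5 and removed in Python 3.8. The log shows pip building matplotlib 3.0.3 from the source tarball on Python 3.8.2, so the install-help path in `setupext.py` crashes before it can report the missing freetype and png packages. A minimal sketch of a check for this condition follows; the helper name `setup_help_will_crash` is hypothetical and is not part of matplotlib or autocluster.

```python
import platform
import sys


def setup_help_will_crash() -> bool:
    """Return True if this interpreter lacks platform.linux_distribution.

    matplotlib 3.0.3's setupext.py calls that function in install_help_msg(),
    so its absence reproduces the AttributeError from the log.
    """
    return not hasattr(platform, "linux_distribution")


if __name__ == "__main__":
    version = ".".join(str(part) for part in sys.version_info[:3])
    if setup_help_will_crash():
        print(f"Python {version}: platform.linux_distribution is missing; "
              "a source build of matplotlib 3.0.3 is likely to hit this error.")
    else:
        print(f"Python {version}: platform.linux_distribution is still available.")
```

If the check returns True, the pinned `matplotlib==3.0.3` requirement is the likely blocker, which matches the issue title's recommendation to upgrade.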

sklearn is deprecated, needs to be changed in requirements.txt

When trying to install autocluster I get this error and can't continue with the installation:

Collecting sklearn
  Downloading sklearn-0.0.post11.tar.gz (3.6 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
      rather than 'sklearn' for pip commands.
      
      Here is how to fix this error in the main use cases:
      - use 'pip install scikit-learn' rather than 'pip install sklearn'
      - replace 'sklearn' by 'scikit-learn' in your pip requirements files
        (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
      - if the 'sklearn' package is used by one of your dependencies,
        it would be great if you take some time to track which package uses
        'sklearn' instead of 'scikit-learn' and report it to their issue tracker
      - as a last resort, set the environment variable
        SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error
      
      More information is available at
      https://github.com/scikit-learn/sklearn-pypi-package
      
      If the previous advice does not cover your use case, feel free to report it at
      https://github.com/scikit-learn/sklearn-pypi-package/issues/new
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.
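
The error text asks for `sklearn` to be replaced by `scikit-learn` in pip requirements files. A minimal sketch of that rewrite, assuming a plain requirements.txt-style list with one requirement per line, could look like the following; the function names `fix_requirement` and `fix_requirements` are hypothetical helpers, not part of autocluster or pip.

```python
import re

# Matches a line whose package name is exactly "sklearn", optionally followed
# by a version specifier, extras, an environment marker, or a comment.
# Names such as "sklearn-crfsuite" are deliberately left alone.
_SKLEARN_REQ = re.compile(r"^(\s*)sklearn(?=\s*(?:[<>=!~;\[#]|$))", re.IGNORECASE)


def fix_requirement(line: str) -> str:
    """Rename the deprecated 'sklearn' requirement to 'scikit-learn'."""
    return _SKLEARN_REQ.sub(r"\1scikit-learn", line, count=1)


def fix_requirements(text: str) -> str:
    """Apply fix_requirement to every line of a requirements file's text."""
    return "\n".join(fix_requirement(line) for line in text.splitlines())


if __name__ == "__main__":
    print(fix_requirements("numpy\nsklearn\nsix>=1.5.0"))
```

One caveat: a version pin written for the old `sklearn` placeholder package (for example a `0.0` release) does not correspond to a scikit-learn version, so any such pin should be reviewed by hand after the rename.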

Consistent Core Dump

Also, when attempting to run the system at all, I consistently hit a core dump:

`>>> from autocluster import AutoCluster, get_evaluator
>>> X, y = datasets.make_blobs(n_samples=1000,
...                            n_features=2,
...                            centers=6,
...                            cluster_std=0.5,
...                            shuffle=True, random_state=27)
>>> dummy_df = pd.DataFrame(X)
>>> dummy_df.head(5)
          0         1
0  7.742343 -6.603815
1  8.726121  6.433689
2 -1.427522  5.393546
3  8.801468 -5.185687
4 -1.404321  9.526536
>>> cluster = AutoCluster(logger=None)
>>> fit_params = {
...     "df": dummy_df,
...     "cluster_alg_ls": [
...         'KMeans', 'GaussianMixture', 'MiniBatchKMeans'
...     ],
...     "dim_reduction_alg_ls": [
...         'NullModel'
...     ],
...     "optimizer": 'smac',
...     "n_evaluations": 40,
...     "run_obj": 'quality',
...     "seed": 27,
...     "cutoff_time": 10,
...     "preprocess_dict": {
...         "numeric_cols": list(range(2)),
...         "categorical_cols": [],
...         "ordinal_cols": [],
...         "y_col": []
...     },
...     "evaluator": get_evaluator(evaluator_ls = ['silhouetteScore',
...                                                'daviesBouldinScore',
...                                                'calinskiHarabaszScore'],
...                                weights = [1, 1, 1],
...                                clustering_num = None,
...                                min_proportion = .01,
...                                min_relative_proportion='default'),
...     "n_folds": 3,
...     "warmstart": False,
...     "verbose_level": 1,
... }
>>> result_dict = cluster.fit(**fit_params)
/home/wolvez/.local/lib/python3.8/site-packages/sklearn/ensemble/_iforest.py:252: FutureWarning: 'behaviour' is deprecated in 0.22 and will be removed in 0.24. You should not pass or set this parameter.
  warn(
664/1000 datapoints remaining after outlier removal
Truncated n_evaluations: 40
Segmentation fault (core dumped)`
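
The session above ends in a segmentation fault with no Python traceback, which makes it hard to tell which native extension crashed. One way to narrow this down, sketched here under the assumption that the crash happens in the interpreter process that calls `cluster.fit`, is to enable the standard library's `faulthandler` before the call. The helper name `enable_crash_traces` and the log path are hypothetical.

```python
import faulthandler


def enable_crash_traces(log_path: str):
    """Make a fatal signal (e.g. SIGSEGV) dump Python tracebacks to log_path.

    Returns the open file object. It must stay open for as long as
    faulthandler is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    crash_log = enable_crash_traces("autocluster_crash.log")
    # ... build fit_params and call cluster.fit(**fit_params) here ...
    print("faulthandler enabled:", faulthandler.is_enabled())
```

Running the script with `python -X faulthandler` has a similar effect without code changes and writes the traceback to stderr. The resulting trace only shows where Python was when the crash occurred; it does not by itself identify the faulty package.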

AttributeError: Can't pickle local object 'AutoCluster.fit.<locals>.evaluate_model'

I encountered this problem while using the AutoCluster module from Anaconda (Anaconda3-2019.03-Windows-x86_64) with Spyder (5.3.4) and Python (3.11.5) on Windows. How can I resolve this issue?

File ~\anaconda3\Lib\site-packages\spyder_kernels\py3compat.py:357 in compat_exec
exec(code, globals, locals)

File c:********** result_dict = cluster.fit(**fit_params)

File ~\anaconda3\Lib\site-packages\autocluster\autocluster.py:305 in fit
optimal_config = self._smac_obj.optimize()

File ~\anaconda3\Lib\site-packages\smac\facade\smac_facade.py:400 in optimize
incumbent = self.solver.run()

File ~\anaconda3\Lib\site-packages\smac\optimizer\smbo.py:165 in run
self.start()

File ~\anaconda3\Lib\site-packages\smac\optimizer\smbo.py:138 in start
self.incumbent = self.initial_design.run()

File ~\anaconda3\Lib\site-packages\smac\initial_design\single_config_initial_design.py:80 in run
status, cost, runtime, additional_info = self.tae_runner.start(

File ~\anaconda3\Lib\site-packages\smac\tae\execute_ta_run.py:164 in start
status, cost, runtime, additional_info = self.run(config=config,

File ~\anaconda3\Lib\site-packages\smac\tae\execute_func.py:134 in run
rval = self._call_ta(obj, config, **obj_kwargs)

File ~\anaconda3\Lib\site-packages\smac\tae\execute_func.py:213 in _call_ta
return obj(config, **kwargs)

File ~\anaconda3\Lib\site-packages\pynisher\limit_function_call.py:198 in __call__
subproc.start()

File ~\anaconda3\Lib\multiprocessing\process.py:121 in start
self._popen = self._Popen(self)

File ~\anaconda3\Lib\multiprocessing\context.py:224 in _Popen
return _default_context.get_context().Process._Popen(process_obj)

File ~\anaconda3\Lib\multiprocessing\context.py:336 in _Popen
return Popen(process_obj)

File ~\anaconda3\Lib\multiprocessing\popen_spawn_win32.py:94 in __init__
reduction.dump(process_obj, to_child)

File ~\anaconda3\Lib\multiprocessing\reduction.py:61 in dump
ForkingPickler(file, protocol).dump(obj)

AttributeError: Can't pickle local object 'AutoCluster.fit.<locals>.evaluate_model'
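
The traceback passes through `popen_spawn_win32.py` and `reduction.dump`, which is the path `multiprocessing` takes on Windows: the child process is spawned, so the process object and its target must be pickled. The standard `pickle` module cannot serialize a function defined inside another function, and the error message names `evaluate_model` as local to `AutoCluster.fit`. The sketch below reproduces that limitation without autocluster; `make_local_evaluator`, `evaluate_model_top_level`, and `can_pickle` are hypothetical names used only for illustration.

```python
import pickle


def make_local_evaluator():
    """Return a function defined inside another function (a local object)."""
    def evaluate_model(cfg):
        return cfg
    return evaluate_model


def evaluate_model_top_level(cfg):
    """Module-level function: pickled by reference to its importable name."""
    return cfg


def can_pickle(obj) -> bool:
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print("local function picklable:", can_pickle(make_local_evaluator()))
    print("module-level function picklable:", can_pickle(evaluate_model_top_level))
```

This suggests the failure is tied to the spawn start method rather than to the user's data. Possible directions, none confirmed by the source, are running on a platform where `fork` is the start method or changing the library so the evaluation function lives at module level.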

Installation problems? I am maintaining a version of this until the authors come back

This is the first Google result for "python automl clustering", and it is frankly a really great library. However, it is no longer maintained and installation is broken.

See

https://github.com/renxida/autocluster

for a version that works as of Jan 23 2023.

I have also submitted pull requests in the hope that the author comes back, and will gladly close this issue if this repo gets some love.
