A python library for distributed fiber optic sensing.
Documentation [stable, development]
License: Other
Description
While discussing quarto for generating API docs, it was reported here that dascore doesn't install on M1 Macs, potentially due to a problem with pytables/HDF5.
Does anyone out there have an M1 Mac we can run tests on? I would like to get this figured out soon, as I think many of our users will be using M1s.
When working with a freshly collected dataset from the Colorado School of Mines Terra15 interrogator, we noticed that calling spool.update creates duplicates in the spool's index.
I tracked this down to dascore.utils.misc.iter_files. The mtime of the recorded DAS files is greater than the timestamp returned by time.time, which is odd since, AFAIK, both should be Unix timestamps (seconds since 1970-01-01 UTC).
For example:
import time
from pathlib import Path
import dascore as dc
path = Path("DASRCN_whale_velocity_UTC-YMD20230531-HMS172443.512_seq_00000000000.hdf5")
print(path.stat().st_mtime) # 1685607120.0
print(time.time()) # 1685587714.1883616
So the indexer is behaving as designed: it is supposed to re-index any files with mtimes after the last time the indexer was run. Since this file's mtime is several hours in the future, it will continue to be added to the index each time update is called.
A few things we need to figure out:
Is m_time really based on the system time and not UTC? That doesn't really make sense to me and is contrary to the answer in this SO post.
Is there something wrong with the mtimes created by the interrogator? The difference between atime, ctime, and mtime is suspicious since the files should be created and finalized within a few minutes.
st_atime=1685586990
st_mtime=1685607120
st_ctime=1685586057
Note that st_mtime - st_atime ~ 6 hours, the difference between UTC and Colorado's time zone.
Perhaps DASCore should check that mtime is <= the current time and, if not, set mtime to the current time? This could slow down and complicate indexing though.
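A minimal sketch of that check (assuming it would live near iter_files; this is not the current DASCore code):
import time
from pathlib import Path

def _safe_mtime(path: Path) -> float:
    """Return the file's mtime, clamped so it can never be in the future."""
    return min(path.stat().st_mtime, time.time())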
Description
When plotting a Patch with a single trace using the wiggle plot, an IndexError is raised.
To Reproduce
import dascore as dc
patch = dc.examples.sin_wave_patch(
    sample_rate=1000,
    frequency=10,
    channel_count=1,
)
patch.viz.wiggle(show=True)
Expected behavior
The wiggle plot should still plot.
@jinwar found out today, during a DASCore presentation, that velocity_to_strain_rate updates the data_type but does not update the associated units. It will be best to fix this in the patch_refactor branch, which implements more support for units.
Description
It isn't as clear as it should be how to add docstring examples.
Description
The cross references currently don't work (e.g., on spool's doc page).
I think this is because the back ticks are replaced with "%60" but never converted back to "`". It should be an easy fix, but may require modifying the regex in the build docs script.
Currently the indexing mechanism for the DirectorySpool requires all values in "time_max", "time_min", "d_time", "distance_max", "distance_min", and "d_distance" to be non-null. However, this doesn't need to be the case: it's conceivable that some patches may not have even sampling in time or distance, or may have a completely different second dimension name (e.g., channel_number). We need to make sure the indexing mechanism is flexible enough to handle these cases.
Any other datetime64 precision will create an error due to the to_number method.
After installing dascore with conda, dascore.__version__ prints "0.0.0". However, conda list does show the correct version (0.0.9). We need to figure out why.
Description
The goal is to create time series from the data recorded during multiphase testing. The code is set up to read each second of the DAS data and calculate some parameters. At some point in the calculation it stops and doesn't make it through the whole dataset.
To Reproduce
bgtime = np.datetime64('2022-05-12 16:04:45')
dt = np.timedelta64(1, 's')
freq_bands = [[1, 10], [10, 100], [100, 500], [500, 1000], [1000, 5000]]
FBEs = [[] for i in freq_bands]  # to create a list of lists
timestamps = []
current_time = bgtime
for i in range(14400):
    gjsignal.print_progress(i)
    try:
        data1 = spool.select(time=(current_time, current_time + dt))
        dataa = dascore.utils.patch.merge_patches(data1)[0]
        DASdata1 = Data2D_XT.Patch_to_Data2D(dataa)
        DASdata1.apply_gauge_length(3)  # apply a gauge length 3x channel spacing
        DASdata1.select_depth(50, 100)
        f, amp = spectrum_analysis(DASdata1)
        for ifreq in range(len(freq_bands)):
            FBEs[ifreq].append(get_FBE(f, amp, freq_bands[ifreq][0], freq_bands[ifreq][1]))
        timestamps.append(current_time)
    except:
        print('there is an error')
    current_time += dt
Expected behavior
It is expected to go through the whole data set and create the time series over the four hours of recorded data.
We have a terra15 strain-rate file which only has the posix_time rather than the gps_time array. It shouldn't be hard, but we need to add logic to the parser to use posix_time if gps_time doesn't exist so this file can be read.
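A minimal sketch of the proposed fallback, assuming an h5py-like mapping for the data group (the array names come from this issue; everything else is an assumption):
def _get_time_array(data_group):
    """Prefer gps_time, but fall back to posix_time when gps_time is absent."""
    if "gps_time" in data_group:
        return data_group["gps_time"][:]
    return data_group["posix_time"][:]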
Hello,
I wonder if I can save the index file to a specified location instead of in the data folder. I'm working on a cluster, and writing to the data folder requires extra permissions.
Thank you!
Rosie
There is some inconsistency in units when specifying a range with a dimension name. For example,
import dascore as dc
patch = dc.get_example_patch()
filtered_time = patch.pass_filter(time=(t1, t2))
filtered_dist = patch.pass_filter(distance=(d1, d2))
Here, should t1 be in Hz or seconds? Should d1 be in m (wavelength) or 1/m (wavenumber)? Is this consistent with select?
e.g.,
sub_patch = patch.select(time=(t1, t2))
I propose the following rule for using a dimension name to specify inputs to functions: a bare dimension name (e.g., time) takes values in that dimension's units, while a trailing underscore (e.g., time_) takes values in the inverse units (frequency or wavenumber). So,
filtered_1 = patch.pass_filter(time=(1, 10)) # filters from 1 to 10 seconds
filtered_2 = patch.pass_filter(time_=(1, 10)) # filters from 1 to 10 Hz
Unfortunately this will break some existing code, but I think the consistency is worth it. We are still in the 0.0.x version range and do still have a big warning that things are rapidly changing, after all ;)
GH now supports deploying to GH-pages with a simple GH action. This is a reminder to switch over to it to (hopefully) simplify doc builds.
Example is here: https://github.com/actions/starter-workflows/blob/main/pages/static.yml
Originally posted by jinwar November 1, 2022
Current error messages for fetching a time section without data are not very clear.
import dascore as dc
sp = dc.spool("path/that/doesnt/exist.hdf5")
raises an error that isn't helpful, something like "couldn't get spool from path/that/doesnt/exist". A more informative error could be raised, such as "path doesn't exist", or, to avoid assuming the input to spool is always a path (it could be a URL in the future), perhaps we just append "if it is a path, it doesn't exist" or something along those lines.
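A minimal sketch of the friendlier check (a hypothetical helper, not DASCore's actual code):
from pathlib import Path

def _check_spool_path(path):
    """Raise a clearer error when a str/Path input to spool does not exist."""
    if not Path(path).exists():
        msg = (
            f"Could not get spool from {path}. "
            "If it is a path, it does not exist."
        )
        raise FileNotFoundError(msg)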
When using spool.select to trim a dimension, the select... string that should be added to the history attribute gets split up so that each letter becomes a separate entry.
spool = (
    dc.get_example_spool("diverse_das")
    .select(distance=(100, 200))
)
print(spool[0].attrs.history)
# 's', 'e', 'l', ...
The select string should be a single entry.
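A plausible (but unverified) mechanism, sketched with plain lists rather than DASCore internals: extending a list with a bare string iterates its characters, while append keeps it whole.
history = []
history += "select(distance=(100, 200))"       # bug: one entry per letter
history = []
history.append("select(distance=(100, 200))")  # correct: a single entry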
Originally posted by d-chambers November 11, 2022
I am checking out the new version of GitHub projects. It looks nice; it supports both board and spreadsheet style views, custom fields, etc.
https://github.com/orgs/DASDAE/projects/2
I am in favor of switching over and archiving the old project board. Thoughts? (I think @eileenrmartin is the only other person adding things to the old project board.)
Originally posted by xunen63 March 30, 2023
When I try to read an hdf5 file with dascore like this:
import dascore as dc
file_path = 'E:\try_PubDAS\FORESEE\FORESEE_UTC_20190404_194804.hdf5'
spool = dc.spool(file_path)
I got:
Traceback (most recent call last):
File "E:\try_PubDAS\DAS_tools.py", line 9, in
spool= dc.spool(file_path)
File "D:\softwares\anaconda\envs\pytorch19\lib\functools.py", line 888, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "D:\softwares\anaconda\envs\pytorch19\lib\site-packages\dascore\core\spool.py", line 327, in spool_from_str
_format, _version = dc.get_format(path)
File "D:\softwares\anaconda\envs\pytorch19\lib\site-packages\dascore\io\core.py", line 506, in get_format
raise UnknownFiberFormat(msg)
dascore.exceptions.UnknownFiberFormat: Could not determine file format of E:\try_PubDAS\FORESEE\FORESEE_UTC_20190404_194804.hdf5
Currently, negative indexing doesn't work on spools, but it should.
import dascore as dc
spool = dc.get_example_spool("random_das")
last_patch = spool[-1] # raises an Error
Negative indexes should work exactly how they do for other python sequences.
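A minimal sketch of the normalization this would need (hypothetical helper; the real fix would live in the spool's __getitem__):
def _normalize_index(index: int, length: int) -> int:
    """Map a possibly-negative index to a positive one, like other sequences."""
    if index < 0:
        index += length
    if not 0 <= index < length:
        raise IndexError(f"index of [{index}] is out of bounds for spool.")
    return index

assert _normalize_index(-1, 10) == 9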
When using dc.spool(path)[0], it returns:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/rosie/Documents/aceffl/DAS/01_Raw/DAS_P10/SM_BriscoeC3339H_CF_P10_UTC_20211224_000909.913.tdms'
which is a file I deleted. The full error message is below:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Input In [10], in <cell line: 5>()
1 # path = fileloc
2
3 # pa = dc.read(path)[0]
4 # or
----> 5 pa = dc.spool(fileloc)[0]
6 pa
File /opt/anaconda3/lib/python3.9/site-packages/dascore/core/spool.py:123, in DataFrameSpool.__getitem__(self, item)
122 def __getitem__(self, item):
--> 123 out = self._get_patches_from_index(item)
124 # a single index was used, should return a single patch
125 if not isinstance(item, slice):
File /opt/anaconda3/lib/python3.9/site-packages/dascore/core/spool.py:158, in DataFrameSpool._get_patches_from_index(self, df_ind)
156 raise IndexError(msg)
157 joined = df1.join(source.drop(columns=df1.columns, errors="ignore"))
--> 158 return self._patch_from_instruction_df(joined)
File /opt/anaconda3/lib/python3.9/site-packages/dascore/core/spool.py:168, in DataFrameSpool._patch_from_instruction_df(self, joined)
165 for patch_kwargs in df_dict_list:
166 # convert kwargs to format understood by parser/patch.select
167 kwargs = _convert_min_max_in_kwargs(patch_kwargs, joined)
--> 168 patch = self._load_patch(kwargs)
169 select_kwargs = {
170 i: v for i, v in kwargs.items() if i in patch.dims or i in patch.coords
171 }
172 out_list.append(patch.select(**select_kwargs))
File /opt/anaconda3/lib/python3.9/site-packages/dascore/clients/dirspool.py:124, in DirectorySpool._load_patch(self, kwargs)
122 def _load_patch(self, kwargs) -> Self:
123 """Given a row from the managed dataframe, return a patch."""
--> 124 patch = dc.read(**kwargs)[0]
125 return patch
File /opt/anaconda3/lib/python3.9/site-packages/dascore/io/core.py:342, in read(path, file_format, file_version, time, distance, **kwargs)
336 file_format, file_version = get_format(
337 path,
338 file_format=file_format,
339 file_version=file_version,
340 )
341 formatter = FiberIO.manager.get_fiberio(file_format, file_version)
--> 342 return formatter.read(
343 path, file_version=file_version, time=time, distance=distance, **kwargs
344 )
File /opt/anaconda3/lib/python3.9/site-packages/dascore/io/tdms/core.py:79, in TDMSFormatterV4713.read(self, path, time, distance, **kwargs)
67 def read(
68 self,
69 path: Union[str, Path],
(...)
72 **kwargs
73 ) -> dc.BaseSpool:
74 """
75 Read a silixa tdms file, return a DataArray.
76
77 """
---> 79 with open(path, "rb") as tdms_file:
80 # get time array. If an input isn't provided for time we return everything
81 if time is None:
82 time = (None, None)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/rosie/Documents/aceffl/DAS/01_Raw/DAS_P10/SM_BriscoeC3339H_CF_P10_UTC_20211224_000909.913.tdms'
After installing quarto, I had an error when trying to build the api docs.
OS: Mac 11.6.5
(dascore) eileenmartin@csm-wl-dhcp-197-90 dascore % where quarto
/usr/local/bin/quarto
(dascore) eileenmartin@csm-wl-dhcp-197-90 dascore % ls
dascore environment.yml scripts
dascore.egg-info pyproject.toml setup.cfg
docs readme.md tests
(dascore) eileenmartin@csm-wl-dhcp-197-90 dascore % cd scripts
(dascore) eileenmartin@csm-wl-dhcp-197-90 scripts % ls
_index_api.py _render_api.py _templates build_api_docs.py
(dascore) eileenmartin@csm-wl-dhcp-197-90 scripts % python build_api_docs.py
Traceback (most recent call last):
File "/Users/eileenmartin/dascore/scripts/build_api_docs.py", line 10, in
data_dict = parse_project(dascore)
File "/Users/eileenmartin/dascore/scripts/_index_api.py", line 196, in parse_project
traverse(obj, data_dict, base_path)
File "/Users/eileenmartin/dascore/scripts/_index_api.py", line 171, in traverse
traverse(mod, data_dict, base_path)
File "/Users/eileenmartin/dascore/scripts/_index_api.py", line 171, in traverse
traverse(mod, data_dict, base_path)
File "/Users/eileenmartin/dascore/scripts/_index_api.py", line 169, in traverse
data_dict[obj_id] = get_data(obj, key, base_path, parent_is_class)
File "/Users/eileenmartin/dascore/scripts/_index_api.py", line 136, in get_data
data = extract_data(obj, parent_is_class)
File "/Users/eileenmartin/dascore/scripts/_index_api.py", line 111, in extract_data
data["short_description"] = docstr.split("\n")[0]
AttributeError: 'NoneType' object has no attribute 'split'
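A minimal sketch of a possible guard for this case (hypothetical; the real fix in _index_api.py may differ): treat a missing docstring as an empty string so .split never runs on None.
import inspect

def short_description(obj) -> str:
    """Return the first docstring line, or "" when the object has no docstring."""
    docstr = inspect.getdoc(obj) or ""
    return docstr.split("\n")[0]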
I tested the ProdML file format in the tests/test_io/test_prodml directory using pytest. The test_prodml_v2_0 and test_prodml_v2_1 passed, but the test_prod_ml did not pass.
Below I added the errors regarding testing this on the example data and my Silixa iDAS dataset.
E dascore.exceptions.UnknownFiberFormat: Could not determine file format of /home/ahmadtourei/.cache/dascore/0.0.0/iDAS005_hdf5_example.626.h5
dascore/io/core.py:504: UnknownFiberFormat
================================================ short test summary info =================================================
ERROR tests/test_io/test_prodml/test_prod_ml.py::TestSilixaFile::test_read_silixa - dascore.exceptions.UnknownFiberFormat: Could not determine file format of /home/ahmadtourei/.cache/dascore/0.0.0/iDAS...
E IndexError: index of [0] is out of bounds for spool.
dascore/core/spool.py:158: IndexError
==================================================== short test summary info ====================================================
ERROR tests/test_io/test_prodml/test_prod_ml.py::TestSilixaFile::test_read_silixa - IndexError: index of [0] is out of bounds for spool.
I can see "GaugeLength" in the attributes list using the HDFView software. However, I can't get the value using patch_0.attrs['gauge_length'].
Please note that I can get some other attributes, such as the sampling interval and channel spacing.
Data format: ONYX - PRODML v. 2.0
import dascore as dc
sp = dc.spool(data_path)
patch_0 = sp[0]
gauge_length = patch_0.attrs['gauge_length']
AttributeError Traceback (most recent call last)
Cell In[3], line 12
8 print(patch_0.attrs)
11 # get sampling rate, channel spacing, and gauge length
---> 12 gauge_length = patch_0.attrs['gauge_length']
13 print("Gauge length = ", gauge_length)
14 channel_spacing = patch_0.attrs['d_distance']
File ~/anaconda3/envs/py10/lib/python3.10/site-packages/dascore/core/schema.py:94, in PatchAttrs.__getitem__(self, item)
93 def __getitem__(self, item):
---> 94 return getattr(self, item)
AttributeError: 'PatchAttrs' object has no attribute 'gauge_length'
We need to enable the generation of a quarto site map so that Google will index our doc pages. This will help, for example, when someone googles "dascore patch detrend", so the correct page gets suggested.
See here for how to do this.
I upgraded dascore to the most recent version and now have an import issue:
ImportError Traceback (most recent call last)
/tmp/ipykernel_312/2507837442.py in
----> 1 import dascore as dc
~/anaconda3/lib/python3.9/site-packages/dascore/__init__.py in <module>
3 from xarray import set_options
4
----> 5 from dascore.core.patch import Patch
6 from dascore.core.schema import PatchAttrs
7 from dascore.core.spool import BaseSpool, spool
~/anaconda3/lib/python3.9/site-packages/dascore/core/__init__.py in <module>
2 Core routines and functionality for processing distributed fiber data.
3 """
----> 4 from .patch import Patch # noqa
~/anaconda3/lib/python3.9/site-packages/dascore/core/patch.py in
14 from dascore.constants import PatchType
15 from dascore.core.schema import PatchAttrs
---> 16 from dascore.io import PatchIO
17 from dascore.transform import TransformPatchNameSpace
18 from dascore.utils.coords import Coords, assign_coords
~/anaconda3/lib/python3.9/site-packages/dascore/io/__init__.py in <module>
2 Modules for reading and writing fiber data.
3 """
----> 4 from dascore.io.core import write
5 from dascore.utils.misc import MethodNameSpace
6
~/anaconda3/lib/python3.9/site-packages/dascore/io/core.py in
25 )
26 from dascore.utils.docs import compose_docstring
---> 27 from dascore.utils.hdf5 import HDF5ExtError
28 from dascore.utils.misc import suppress_warnings
29 from dascore.utils.patch import scan_patches
~/anaconda3/lib/python3.9/site-packages/dascore/utils/hdf5.py in
14 import numpy as np
15 import pandas as pd
---> 16 import tables
17 from packaging.version import parse as get_version
18 from tables import ClosedNodeError
~/anaconda3/lib/python3.9/site-packages/tables/__init__.py in <module>
22
23 # Necessary imports to get versions stored on the cython extension
---> 24 from .utilsextension import (
25 get_pytables_version, get_hdf5_version, blosc_compressor_list,
26 blosc_compcode_to_compname_ as blosc_compcode_to_compname,
tables/utilsextension.pyx in init tables.utilsextension()
ImportError: cannot import name typeDict
Description
Slack message from Jin:
There seem to be some bugs with decimate.
patch1:
["select(copy=False,time=(numpy.datetime64('2015-11-09T21:17:00'), numpy.datetime64('2015-11-09T21:18:10')))",
"decimate(copy=True,dim='time',factor=4,lowpass=True)",
"decimate(copy=True,dim='time',factor=5,lowpass=True)",
"decimate(copy=True,dim='time',factor=10,lowpass=True)",
"decimate(copy=True,dim='time',factor=10,lowpass=True)"]
patch2:
["select(copy=False,time=(numpy.datetime64('2015-11-09T21:17:00'), numpy.datetime64('2015-11-09T21:18:10')))",
'pass_filter(corners=4,time=(None, 0.5),zerophase=True)',
"decimate(copy=True,dim='time',factor=2000,lowpass=False)"]
patch3:
data = p[0].data.copy()
lpdata = gjsignal.lpfilter(data,0.0005,0.5,axis=1)[:,::2000]
plt.figure()
plt.plot(dsp.data[200,:],label='patch.decimate')
plt.plot(dsp2.data[200,:],label='patch.pass_filter.decimate(lowpass=False)')
plt.plot(lpdata[200,:],label='gjsignal.lpfilter')
plt.legend()
The time axis label in the waterfall plot should just read "time", not "time(s)". This should also work for other dimensions that are not named "time" but have datetime64 types.
We need a Patch.dropna function, which should simply be based on pandas.dropna or, easier yet, xarray's dropna.
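For reference, here is what the xarray behavior we'd lean on looks like (a toy example with made-up values, not DASCore code):
import numpy as np
import xarray as xr

# toy DataArray standing in for a patch's data
da = xr.DataArray(
    np.array([[1.0, np.nan], [2.0, 3.0]]),
    dims=("distance", "time"),
    coords={"distance": [0, 1], "time": [0.0, 0.1]},
)
# drop every "time" label that contains any NaN values
cleaned = da.dropna(dim="time", how="any")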
We need to fix a few things in the Patch documentation.
I saw a great blog post about how to improve GitHub actions for python projects.
We should adopt these practices, particularly implementing caching for downloaded test files.
While working on implementing IO support, we found that a FiberIO subclass without a get_format method didn't work with dc.read(path, format_name), because the get_format logic was still being called. We need to look into why this is and make sure get_format is not needed when the file_format is specified.
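A hedged sketch of the intended short-circuit (hypothetical helper; the real change would belong in dascore.io.core.read): only fall back to get_format when the caller hasn't fully specified the format.
import dascore as dc

def _resolve_format(path, file_format=None, file_version=None):
    """Only call dc.get_format when the caller hasn't specified both values."""
    if file_format is None or file_version is None:
        file_format, file_version = dc.get_format(
            path, file_format=file_format, file_version=file_version
        )
    return file_format, file_version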
Currently the unit display is a bit overly verbose. For example, "meter" rather than "m" is displayed in plots and such. This SO post shows how to configure pint to be more concise.
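For reference, a minimal sketch of the pint setting the post refers to (assuming we configure the registry DASCore uses; a fresh registry is shown here for illustration):
import pint

ureg = pint.UnitRegistry()
ureg.default_format = "~"        # "~" requests abbreviated unit symbols
print(f"{ureg.meter:~}")         # -> "m" rather than "meter"
print(f"{3 * ureg.meter:~}")     # -> "3 m"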
Currently the following raises an error:
import dascore as dc
patch = dc.get_example_patch()
amp = patch.tran.rfft().abs()
because the Patch.new method can mix up the dimensions when the coordinate dict is out of order.
Currently the order of the coordinate dictionary can affect the expected dimensions. This means that passing a combination of data and coords to Patch.new works if the order of the coord dict is right and raises otherwise. This is due to this line in Patch.new.
import numpy as np
import dascore as dc
patch = dc.get_example_patch()
axis = patch.dims.index("time")
data = np.std(patch.data, axis=axis, keepdims=True)
new_time = patch.coords["time"][0:1]
new_dist = patch.coords["distance"]
coords_1 = {"time": new_time, "distance": new_dist}
coords_2 = {"distance": new_dist, "time": new_time}
# One of these works, the other doesn't
out_1 = patch.new(data=data, coords=coords_1)
out_2 = patch.new(data=data, coords=coords_2)
The coord dict order shouldn't affect the patch construction.
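A possible workaround until Patch.new is fixed, continuing the snippet above (a sketch, not the planned fix): order the coords dict by the patch's existing dims so that insertion order cannot matter.
ordered_coords = {dim: coords_2[dim] for dim in patch.dims}
out_3 = patch.new(data=data, coords=ordered_coords)  # order no longer matters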
The dev install procedure does not work anymore:
https://dasdae.github.io/dascore/markdown/contributing/dev_install.html
Error: File "setup.py" not found.
Description
I think to_timedelta64 should return a negative timedelta64 when given a negative input, but currently it doesn't.
import dascore as dc
assert dc.to_timedelta64(-0.1) == -dc.to_timedelta64(0.1) # currently fails
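For reference, a minimal sign-preserving conversion is straightforward with numpy (a sketch only, not DASCore's implementation):
import numpy as np

def to_td64(seconds: float) -> np.timedelta64:
    """Sign-preserving float-seconds to timedelta64[ns] conversion (sketch only)."""
    return np.timedelta64(int(round(seconds * 1_000_000_000)), "ns")

assert to_td64(-0.1) == -to_td64(0.1)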
Currently, velocity_to_strain_rate uses the order parameter of findiff incorrectly. Although the docs describe that parameter as the order of the stencil for calculating the first derivative, it is actually the order of the derivative (e.g., 2 means the second derivative, not a stencil with an accuracy of 2 cells).
# patch from terra15 data
ok = patch.tran.velocity_to_strain_rate() # works fine since default order is 1
wrong = patch.tran.velocity_to_strain_rate(order=4) # actually 4th derivative
Currently, chunk doesn't work with a file spool because FileSpool doesn't accept an instance of itself as an input argument to its __init__ method, which is expected by DataFrameSpool.chunk, from which FileSpool inherits.
file_path = fetch('terra15_das_1_trimmed.hdf5')
spool = dc.spool(file_path)
# this raises
spool.chunk(time=.01)
It should "just work".
DASCore currently doesn't support python datetime or timedelta objects. It should.
from datetime import datetime
import numpy as np
import dascore as dc
py_dt = datetime.now()
py_td = datetime.now() - py_dt
# these both raise an error
dc.to_datetime64(py_dt)
dc.to_timedelta64(py_td)
# they should also work with lists
dc.to_datetime64([py_dt] * 10)
dc.to_timedelta64([py_td] * 10)
# and arrays
dc.to_datetime64(np.array([py_dt] * 10))
dc.to_timedelta64(np.array([py_td] * 10))
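Numpy already understands python datetime/timedelta objects, so the scalar part of the conversion could be as simple as the following sketch (the helper names and ns precision are assumptions; list/array handling would still need to be added):
from datetime import datetime, timedelta
import numpy as np

def py_to_datetime64(value: datetime) -> np.datetime64:
    """Convert a python datetime to datetime64 (ns precision assumed)."""
    return np.datetime64(value, "ns")

def py_to_timedelta64(value: timedelta) -> np.timedelta64:
    """Convert a python timedelta to timedelta64 (ns precision assumed)."""
    return np.timedelta64(value).astype("timedelta64[ns]")

assert py_to_datetime64(datetime(2023, 1, 1)) == np.datetime64("2023-01-01", "ns")
assert py_to_timedelta64(timedelta(seconds=1)) == np.timedelta64(1, "s")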
I have a list of patches, and I want to merge them together to create one patch using the chunk() method. I am using spool_from_patch_list to create a spool from the list of patches and then using chunk() to merge them:
new_spool = spool_from_patch_list(plist)
p = spool(new_spool).chunk(time=None)
However, since the new_spool's type is 'dascore.core.spool.MemorySpool' (instead of 'dascore.clients.dirspool.DirectorySpool'), I'm getting the following error:
ValueError Traceback (most recent call last)
Cell In[5], line 3
1 import imp
2 imp.reload(lfproc)
----> 3 plist = lfproc.gather_results(output_folder)
File ~/coding/lfproc.py:156, in gather_results(folder)
--> 156 return spool(new_spool).chunk(time=None)
File ~/coding/dascore/dascore/core/spool.py:234, in DataFrameSpool.chunk(self, overlap, keep_partial, snap_coords, tolerance, **kwargs)
226 df = self._df.drop(columns=list(self._drop_columns), errors="ignore")
227 chunker = ChunkManager(
228 overlap=overlap,
229 keep_partial=keep_partial,
(...)
232 **kwargs,
233 )
--> 234 in_df, out_df = chunker.chunk(df)
235 if df.empty:
236 instructions = None
File ~/coding/dascore/dascore/utils/chunk.py:309, in ChunkManager.chunk(self, df)
300 new_start_stop = get_intervals(
301 start,
302 stop,
(...)
306 keep_partials=self._keep_partials,
307 )
308 # create the newly chunked dataframe
--> 309 sub_new_df = self._create_df(current_df, self._name, new_start_stop, gnum)
310 out.append(sub_new_df)
312 out = pd.concat(out, axis=0).reset_index(drop=True)
File ~/coding/dascore/dascore/utils/chunk.py:199, in ChunkManager._create_df(self, df, name, start_stop, gnum)
197 vals = merger[col].unique()
198 assert len(vals) == 1, "Haven't yet implemented non-homogenous merging"
--> 199 out[col] = vals[0]
200 # add the group number for getting instruction df later
201 out["_group"] = gnum
File ~/anaconda3/envs/dascore/lib/python3.11/site-packages/pandas/core/frame.py:3950, in DataFrame.__setitem__(self, key, value)
3947 self._setitem_array([key], value)
3948 else:
3949 # set column
-> 3950 self._set_item(key, value)
File ~/anaconda3/envs/dascore/lib/python3.11/site-packages/pandas/core/frame.py:4143, in DataFrame._set_item(self, key, value)
4133 def _set_item(self, key, value) -> None:
4134 """
4135 Add series to DataFrame in specified column.
4136
(...)
4141 ensure homogeneity.
4142 """
-> 4143 value = self._sanitize_column(value)
4145 if (
4146 key in self.columns
4147 and value.ndim == 1
4148 and not is_extension_array_dtype(value)
4149 ):
4150 # broadcast across multiple columns if necessary
4151 if not self.columns.is_unique or isinstance(self.columns, MultiIndex):
File ~/anaconda3/envs/dascore/lib/python3.11/site-packages/pandas/core/frame.py:4870, in DataFrame._sanitize_column(self, value)
4867 return _reindex_for_setitem(Series(value), self.index)
4869 if is_list_like(value):
-> 4870 com.require_length_match(value, self.index)
4871 return sanitize_array(value, self.index, copy=True, allow_2d=True)
File ~/anaconda3/envs/dascore/lib/python3.11/site-packages/pandas/core/common.py:576, in require_length_match(data, index)
572 """
573 Check the length of data matches the length of the index.
574 """
575 if len(data) != len(index):
--> 576 raise ValueError(
577 "Length of values "
578 f"({len(data)}) "
579 "does not match length of index "
580 f"({len(index)})"
581 )
ValueError: Length of values (2) does not match length of index (1)
Also, is there any other way to merge patches from a list of patches?
Thanks in advance!
We need to restrict the distribution contents to only the necessary files/folders. For example, we shouldn't include the docs or .github directories. This can be done with a simple MANIFEST.in file. Here is an example.