An easy library for Python file locking. It works on Windows, Linux, BSD and Unix systems and can even perform distributed locking. Naturally it also supports the with statement.

Home Page: http://portalocker.readthedocs.io/en/latest/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
python locking lock distributed

portalocker's Introduction

portalocker - Cross-platform locking library

Overview

Portalocker is a library to provide an easy API to file locking.

An important detail to note is that on Linux and Unix systems the locks are advisory by default. By specifying the -o mand option to the mount command it is possible to enable mandatory file locking on Linux. This is generally not recommended, however. For more information about the subject:

The module is currently maintained by Rick van Hattem <[email protected]>. The project resides at https://github.com/WoLpH/portalocker . Bugs and feature requests can be submitted there. Patches are also very welcome.

Security contact information

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.

Redis Locks

This library now features a lock based on Redis which allows for locks across multiple threads, processes and even distributed locks across multiple computers.

It is an extremely reliable Redis lock that is based on pubsub.

As opposed to most Redis locking systems based on key/value pairs, this locking method is based on the pubsub system. The big advantage is that if the connection gets killed due to network issues, crashing processes or otherwise, it will still immediately unlock instead of waiting for a lock timeout.

First make sure you have everything installed correctly:

pip install "portalocker[redis]"

Usage is really easy:

import portalocker

lock = portalocker.RedisLock('some_lock_channel_name')

with lock:
    print('do something here')

The API is essentially identical to the other Lock classes so in addition to the with statement you can also use lock.acquire(...).

Python 2

Python 2 was supported in versions before Portalocker 2.0. If you are still using Python 2, you can run this to install:

pip install "portalocker<2"

Tips

On some networked filesystems it might be necessary to force an os.fsync() before closing the file so it's actually written before another client reads the file. Effectively this comes down to:

import os
import portalocker

with portalocker.Lock('some_file', 'rb+', timeout=60) as fh:
    # do what you need to do
    ...

    # flush and sync to filesystem
    fh.flush()
    os.fsync(fh.fileno())

Examples

To make sure your cache generation scripts don't race, use the Lock class:

>>> import portalocker
>>> with portalocker.Lock('somefile', timeout=1) as fh:
...     print('writing some stuff to my cache...', file=fh)

To customize the opening and locking, a manual approach is also possible:

>>> import portalocker
>>> file = open('somefile', 'r+')
>>> portalocker.lock(file, portalocker.LockFlags.EXCLUSIVE)
>>> file.seek(12)
>>> file.write('foo')
>>> file.close()

Explicitly unlocking is not needed in most cases but omitting it has been known to cause issues: AzureAD/microsoft-authentication-extensions-for-python#42 (comment)

If needed, it can be done through:

>>> portalocker.unlock(file)

Do note that your data might still be in a buffer so it is possible that your data is not available until you flush() or close().

To create a cross-platform bounded semaphore across multiple processes you can use the BoundedSemaphore class, which functions somewhat similarly to `threading.BoundedSemaphore`:

>>> import portalocker
>>> n = 2
>>> timeout = 0.1

>>> semaphore_a = portalocker.BoundedSemaphore(n, timeout=timeout)
>>> semaphore_b = portalocker.BoundedSemaphore(n, timeout=timeout)
>>> semaphore_c = portalocker.BoundedSemaphore(n, timeout=timeout)

>>> semaphore_a.acquire()
<portalocker.utils.Lock object at ...>
>>> semaphore_b.acquire()
<portalocker.utils.Lock object at ...>
>>> semaphore_c.acquire()
Traceback (most recent call last):
  ...
portalocker.exceptions.AlreadyLocked

More examples can be found in the tests.

Versioning

This library follows Semantic Versioning.

Changelog

Every release has a git tag with a commit message for the tag explaining what was added and/or changed. The list of tags/releases including the commit messages can be found here: https://github.com/WoLpH/portalocker/releases

License

See the LICENSE file.

portalocker's People

Contributors

ahauan4, bwbeach, flaviens, hugovk, ignatenkobrain, jonringer, joshuacwnewton, kianmeng, lkindrat-xmos, lukemurphey, mgorny, muhrin, naggie, rayluo, shadchin, techtonik, wolph

portalocker's Issues

How to remove the .lock file after the unlock?

Hi,

I'm currently using this library in a Python application.

It is working quite well, but it does not remove the .lock file after the unlock.

Is there any way/option to do it? The existence of this file while there are no locking processes is quite confusing.

Thanks.

Better Examples Needed

https://portalocker.readthedocs.io/en/latest/#examples

These examples just show how to lock a file, not what errors or responses are possible. What happens if another process tries to lock a file at the same time? Showing this would help convince people to install the module.

Also, your second example does not work if the lock file does not exist.

Additionally, there should be an example of how to fail a lock. If I just copy and paste your example, I have to do some digging to figure out how to fail immediately. There's also no API reference to check what the available arguments are.

Also, is there any way to update the lock from one flag to another without unlocking?

mode variable set but immediately overridden

Hi @wolph ,

I've been reading the code to figure out how to write a lock checker. Whilst looking at it I found what looks like a mistake:

https://github.com/WoLpH/portalocker/blob/c34a48e5724b3874b7f5722a0192d213c81cf6c0/portalocker/portalocker.py#L51

            mode = win32con.LOCKFILE_EXCLUSIVE_LOCK
            if flags & constants.LOCK_NB:
                mode |= win32con.LOCKFILE_FAIL_IMMEDIATELY

            if flags & constants.LOCK_NB:
                mode = msvcrt.LK_NBLCK
            else:
                mode = msvcrt.LK_LOCK

It seems that mode will always be set to either msvcrt.LK_NBLCK or msvcrt.LK_LOCK, making the first 3 lines in the snippet redundant.

Are the assignments supposed to set a bit instead of the entire variable? If not, assuming the 2 methods have the same effect, should the first 3 lines just be removed?

Thanks!
Callan

LockException(OSError(22, 'Invalid argument')) when using NON_BLOCKING flag on Linux

OS: Zorin OS (Ubuntu based Linux)

Example

import portalocker
from portalocker.exceptions import LockException

file = open('test.lock', 'w+', encoding='utf-8')
portalocker.lock(file, portalocker.LockFlags.NON_BLOCKING)

Traceback

Traceback (most recent call last):
  File "/home/maste/.local/lib/python3.10/site-packages/portalocker/portalocker.py", line 138, in lock
    fcntl.flock(file_.fileno(), flags)
OSError: [Errno 22] Invalid argument

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/maste/Documents/GitHub/music-caster/src/test.py", line 5, in <module>
    portalocker.lock(file, portalocker.LockFlags.NON_BLOCKING)
  File "/home/maste/.local/lib/python3.10/site-packages/portalocker/portalocker.py", line 142, in lock
    raise exceptions.LockException(exc_value, fh=file_)
portalocker.exceptions.LockException: [Errno 22] Invalid argument
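
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the traceback shows the flags being passed straight to fcntl.flock, and flock expects the non-blocking bit to be combined with a lock type. The sketch below uses only the standard library on Linux to show that LOCK_NB alone is rejected while LOCK_EX | LOCK_NB is accepted. The portalocker equivalent would presumably be portalocker.LockFlags.EXCLUSIVE | portalocker.LockFlags.NON_BLOCKING.

```python
import errno
import fcntl
import tempfile

with tempfile.TemporaryFile() as fh:
    # LOCK_NB on its own names no lock type, so the call is rejected.
    try:
        fcntl.flock(fh.fileno(), fcntl.LOCK_NB)
        error_code = None
    except OSError as exc:
        error_code = exc.errno

    # Combined with a lock type, the same flag is accepted.
    fcntl.flock(fh.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    fcntl.flock(fh.fileno(), fcntl.LOCK_UN)

# True when the bare flag was rejected as an invalid argument.
print(error_code == errno.EINVAL)
```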

unsupported or invalid wheel

When I try to install the version 0.5.7 from PyPi with pip 8.0.3 on Python 2.7 the installation of portalocker fails with the message portalocker is in an unsupported or invalid wheel.
This also happens on a travis-instance running Python 2.7. Installation on a Python 3.4 instance however works without problem.
I suspect, without further inspection, that the wheel that you provide is not compatible with 2.7.

Windows shared lock with Python 3

In portalocker.py for SHARED locks the code has a sys.major version conditional that uses msvcrt flags on Python 3. The problem is that win32file.LockFileEx is being used so the flags don't match, and a non-blocking flag will block instead. Just remove the "if sys.version" and always use the win32con flags since msvcrt isn't used for SHARED.

Read lock danger

I was trying to make a read lock. I spotted from the docs that the default mode is append, which is a bit dangerous already, so I set the mode to read.

In [208]: lock = portalocker.Lock(pathname, mode='r')

In [209]: lock.acquire()
---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-209-394e97d7f4d5> in <module>()
----> 1 lock.acquire()

/home/pforman/workspace/ipy27/lib/python2.7/site-packages/portalocker/utils.pyc in acquire(self, timeout, check_interval, fail_when_locked)
    105
    106         # Prepare the filehandle (truncate if needed)
--> 107         fh = self._prepare_fh(fh)
    108
    109         self.fh = fh

/home/pforman/workspace/ipy27/lib/python2.7/site-packages/portalocker/utils.pyc in _prepare_fh(self, fh, truncate)
    134         if truncate is not None:
    135             fh.seek(truncate)
--> 136             fh.truncate(truncate)
    137
    138         return fh

IOError: File not open for writing

Scary stuff. I see now that I also need to set truncate=None.

My problem would have been avoided by changing the default in Lock.__init__() to truncate=None.

The default Lock is effectively mode='w' but that is not obvious from mode='a', truncate=0. Indeed it will be surprising when Lock(pathname, mode='a') truncates your file and discards the contents you thought you were appending to.

I suggest that the default for mode be set to 'r' and truncate and flags default to None. The mode argument should then drive other defaulting, e.g. flags is None and 'r' in mode and not '+' in mode => self.flags = LOCK_SH | LOCK_NB. You might represent 'w' in the mode argument as 'W' in self.mode to maintain the truncate only after successful lock action; there are probably better ways of achieving that result.

Further to that I do not think there should be any truncate variables. The work done in _prepare_fh can be simplified to fh.seek(0); fh.truncate() if write mode was specified in __init__. Let the user do their own seek if they want do something other than overwrite or append.
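
The mode-driven defaulting suggested above could look roughly like the sketch below. The function name default_flags is hypothetical, the constants come from the standard-library fcntl module rather than from portalocker, and falling back to an exclusive non-blocking lock for every mode other than plain read is an assumption, since the proposal only spells out the read-only case.

```python
import fcntl


def default_flags(mode, flags=None):
    """Pick lock flags from the open() mode unless the caller supplied some."""
    if flags is not None:
        return flags
    # Plain read modes ('r', 'rb') only need a shared lock.
    read_only = 'r' in mode and '+' not in mode
    lock_type = fcntl.LOCK_SH if read_only else fcntl.LOCK_EX
    return lock_type | fcntl.LOCK_NB


shared = default_flags('r')
exclusive = default_flags('r+')
explicit = default_flags('r', flags=fcntl.LOCK_EX)
```

Explicit flags always win here, so existing callers that pass their own flags would see no change in behaviour.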

is lock reenterable?

with portalocker.Lock("temp", flags=portalocker.LOCK_EX) as fd:
    print("locked")
    with portalocker.Lock("temp", flags=portalocker.LOCK_EX):
        print("reentrant")
        input()

I ran this test and found that portalocker is not reentrant. Does portalocker support reentrant locks?
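
One way to get reentrant behaviour inside a single process is to count nested acquisitions and only touch the file lock at the outermost level. The sketch below is an illustration under assumptions: ReentrantFileLock is a hypothetical helper, it uses the standard-library fcntl.flock on Linux rather than portalocker, and it is not thread-safe.

```python
import fcntl
import os
import tempfile


class ReentrantFileLock:
    """Hypothetical helper: a process-local counter around a single flock."""

    def __init__(self, path):
        self.path = path
        self._fh = None
        self._depth = 0

    def __enter__(self):
        if self._depth == 0:
            # Only the outermost entry opens and locks the file.
            self._fh = open(self.path, 'a')
            fcntl.flock(self._fh.fileno(), fcntl.LOCK_EX)
        self._depth += 1
        return self

    def __exit__(self, *exc_info):
        self._depth -= 1
        if self._depth == 0:
            # Only the outermost exit unlocks and closes the file.
            fcntl.flock(self._fh.fileno(), fcntl.LOCK_UN)
            self._fh.close()
            self._fh = None


lock = ReentrantFileLock(os.path.join(tempfile.mkdtemp(), 'temp'))
depths = []
with lock:
    depths.append(lock._depth)
    with lock:  # nested entry does not block
        depths.append(lock._depth)
```

Reentrancy only applies when the same lock object is reused; two separate objects for the same path would still block each other.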

Sorry for OT

Hello @wolph
Since I can't write to you through your website either, I'll try here. @RickDB is also not available.
The forum Hyperion-Project.org has been down since yesterday. Could you possibly take a look? Thank you very much for your help.

"portalocker is in an unsupported or invalid wheel"

$ pip --no-cache-dir install portalocker
Collecting portalocker
Downloading portalocker-0.6.0-py2.py3-none-any.whl
Installing collected packages: portalocker
portalocker is in an unsupported or invalid wheel
$ uname -a
Linux martin-desktop 4.4.0-34-generic #53~14.04.1-Ubuntu SMP Wed Jul 27 16:56:40 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$

0.6.0 seems to have been uploaded today so perhaps this is a bad release?

Allow to specify lock method while creating lock

When using the first method mentioned in the examples here, http://portalocker.readthedocs.org/en/latest/usage.html#examples , the lock method is hard coded to the following in portalocker/utils.py:

LOCK_METHOD = portalocker.LOCK_EX | portalocker.LOCK_NB

Instead of doing it that way, could we specify the lock method in the constructor? Or could it be determined based on whether the file is being used for read or write? This Lock object has the timeout functionality which is useful while opening files for reading(if they are locked for writing). But there is no need for LOCK_EX in 'r' mode.

Please let me know if I missed something.

Does portalocker support python 3.5.2 win?

Windows 7 x64, python 3.5.2

code from example, 'somefile' exists and writeable
file = open('somefile', 'r+')
portalocker.lock(file, portalocker.LOCK_EX)

crashed with message
Process finished with exit code -1073740777 (0xC0000417)
stack overflow/stack exhaustion
on line 33 in portalocker.py
msvcrt.locking(file_.fileno(), mode, -1)

But that is not all. Rewrite code to
portalocker.lock(file, portalocker.LOCK_SH)
and script fail with message

File "C:\Python\lib\site-packages\portalocker\portalocker.py", line 14, in lock
mode = msvcrt.LK_RLOCK
AttributeError: module 'msvcrt' has no attribute 'LK_RLOCK'

mcvcrt

redis submodule cannot be imported due to circular import

Hi WoLpH,

thank you for developing and maintaining this package. I recently installed the version 2.3.0 via pip and figured out that the redis submodule did not properly install. It seems that the problem I've encountered can be traced back to line 10 in the redis submodule from redis import client which raises an ImportError due to a circular import.

I tried the earliest version (2.1.0) in which the RedisLock has been implemented and also looked into the different branches of this project but this particular line of code seems to be in all of them, thus I do not know if I can circumvent this issue and still install and import the redis submodule or if I have missed something else.

Best regards,

Christian

Change the error message showed when the resource is locked

Now I'm putting the exclusive code inside the next statement: with portalocker.Lock():.

It works well, but..

Is there any way to change the given message when there's another process using the resource?

Now I'm getting:

[CRITICAL] Unhandled exception on Autosubmit: [Errno 11] Resource temporarily unavailable
Traceback (most recent call last):
...
...
AlreadyLocked: [Errno 11] Resource temporarily unavailable

Which is an ugly error and it is not intuitive for the users. Is it possible to show another message more friendly for the users?

For example: "There is another instance of Autosubmit running this experiment, stop it before continue, please". Or something similar.

Surely you agree with me that the full Python traceback is useless and unpleasant for the users.

*Autosubmit is the tool that i'm developing.

Many thanks ;)
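
A common pattern for this is to catch the low-level exception at the call site and substitute your own message. The sketch below shows the idea with the standard-library fcntl.flock on Linux; run_exclusively and FRIENDLY_MESSAGE are hypothetical names. With portalocker, the exception to catch would presumably be portalocker.exceptions.AlreadyLocked, as shown in the traceback above.

```python
import fcntl
import os
import tempfile

FRIENDLY_MESSAGE = (
    'Another instance is already using this resource; '
    'stop it before continuing.'
)


def run_exclusively(path, action):
    """Run action() under a non-blocking exclusive lock, or return a friendly message."""
    with open(path, 'a') as fh:
        try:
            fcntl.flock(fh.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Replace "Resource temporarily unavailable" with readable text.
            return FRIENDLY_MESSAGE
        return action()


lock_path = os.path.join(tempfile.mkdtemp(), 'experiment.lock')

# The outer call holds the lock while the nested call tries to take it again.
contended = run_exclusively(
    lock_path, lambda: run_exclusively(lock_path, lambda: 'ran'))

# Once the outer call has finished, the lock is free again.
uncontended = run_exclusively(lock_path, lambda: 'ran')
```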

AttributeError: 'msvcrt' object has no attribute 'LK_RLOCK'

I try to use this package in Windows OS, code is following:

import portalocker
import time

f = open("z:\\test1.txt","r")
portalocker.lock(f,portalocker.LOCK_SH)
f.write("hello, from  windows\n")
f.flush()
time.sleep(300)

f.close()

print "done"

But errors raised:
Traceback (most recent call last):
File "C:/Users/cifsuser10/Desktop/test.py", line 5, in
portalocker.lock(f,portalocker.LOCK_SH)
File "C:\Python27\lib\site-packages\portalocker\portalocker.py", line 14, in lock
mode = msvcrt.LK_RLOCK
AttributeError: 'module' object has no attribute 'LK_RLOCK'

I check the mode msvcrt.

dir(msvcrt)
['CRT_ASSEMBLY_VERSION', 'LIBRARIES_ASSEMBLY_NAME_PREFIX', 'LK_LOCK', 'LK_NBLCK', 'LK_NBRLCK', 'LK_RLCK', 'LK_UNLCK', 'VC_ASSEMBLY_PUBLICKEYTOKEN', '__doc__', '__name__', '__package__', 'get_osfhandle', 'getch', 'getche', 'getwch', 'getwche', 'heapmin', 'kbhit', 'locking', 'open_osfhandle', 'putch', 'putwch', 'setmode', 'ungetch', 'ungetwch']

No 'LK_RLOCK', but similar 'LK_RLCK', so, I think maybe this is a bug(Python 2.7.13, Windows Server 2012)

Python3 support

Hi,

Please port the package to Python3.

Thanks for considering.

DOC: clarify "[portalocker is] an extended version of portalocker [...]"

Nitpick asking for clarification in a couple places. The one-line summary for this repo reads "An extended version of portalocker to lock files in Python using the with statement" and the one-line summary on pypi reads "Wraps the portalocker recipe for easy usage"

These make it sound like there is another package also called portalocker that this is wrapping. can these be re-worded to clarify whether or not there exist two separate projects?

Doesn't work on Ubuntu 14.04, python 2.7.6?

So I did this in one python console (the file already existed)

import portalocker as pl
f1 = open('./asdf', 'w')
pl.lock(f1, pl.LOCK_EX)

Then in another python console I did this

f1 = open('./asdf', 'w')
f1.write('zzzzzzzzzzzzzzzzfdffffffffffffffffffffffffff')
f1.close()

Then I do this in a bash console

$ cat ./asdf
zzzzzzzzzzzzzzzzfdffffffffffffffffffffffffff

Am I missing something about how this is supposed to work, or does it just not lock the file?
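
This matches what the Overview says about locks being advisory on Linux: a lock only stops other code that also asks for the lock. The sketch below reproduces the report in a single process using the standard-library fcntl.flock, the call that appears in the Linux tracebacks elsewhere on this page; the file name and contents are placeholders.

```python
import fcntl
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'asdf')

# "First console": open the file and take an exclusive advisory lock.
holder = open(path, 'w')
fcntl.flock(holder.fileno(), fcntl.LOCK_EX)

# "Second console", as in the report: write without asking for the lock.
with open(path, 'w') as writer:
    writer.write('zzzz')

# A cooperating writer asks for the lock first and is refused while it is held.
with open(path, 'r+') as cooperating:
    try:
        fcntl.flock(cooperating.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        refused = False
    except BlockingIOError:
        refused = True

with open(path) as reader:
    contents = reader.read()

holder.close()
```

The unlocked write goes through, while the writer that requests the lock is turned away, so every process touching the file has to take the lock for it to protect anything.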

Timeout not working with SHARED flag

The timeout doesn't seem to work using the SHARED flag

I tried with this sample code (without the flag):

from pathlib import Path
import os
from time import sleep
import portalocker as pl
from datetime import datetime

DIRECTORY = Path('.')
p_lock = Path(DIRECTORY / '.lock')
print("Start", datetime.now())
with pl.Lock(p_lock, 'r', timeout=5):
    print("Lock acquired", datetime.now())
    for i in range(10):
        print("Sleeping", i)
        sleep(1)

And issuing flock .lock --command "sleep 120" on another terminal, and as you can see on the following output, it raised an error after 5 seconds, just as expected:

$ flock .lock --command "sleep 120" & python sample.py
[1] 1019583
Start 2022-01-27 10:57:05.303649
Traceback (most recent call last):
  File "/home/davo/sample.py", line 10, in <module>
    with pl.Lock(p_lock, 'r', timeout=5):
  File "/home/davo/.local/lib/python3.10/site-packages/portalocker/utils.py", line 153, in __enter__
    return self.acquire()
  File "/home/davo/.local/lib/python3.10/site-packages/portalocker/utils.py", line 254, in acquire
    raise exceptions.LockException(exception)
portalocker.exceptions.LockException: [Errno 11] Resource temporarily unavailable

On the other hand, when I add the SHARED flag, it waits until the file is unlocked again

from pathlib import Path
import os
from time import sleep
import portalocker as pl
from datetime import datetime

DIRECTORY = Path('.')
p_lock = Path(DIRECTORY / '.lock')
print("Start", datetime.now())
with pl.Lock(p_lock, 'r', timeout=5, flags=pl.LockFlags.SHARED):
    print("Lock acquired", datetime.now())
    for i in range(10):
        print("Sleeping", i)
        sleep(1)

As you can see in the output, it waits 15 seconds to access the file, instead of raising an error after 5

$  flock .lock --command "sleep 15" & python sample.py
[1] 1034452
Start 2022-01-27 11:07:03.311756
[1]  + 1034452 done       flock .lock --command "sleep 15"
Lock acquired 2022-01-27 11:07:18.264913
Sleeping 0
Sleeping 1
Sleeping 2
Sleeping 3
Sleeping 4
Sleeping 5
Sleeping 6
Sleeping 7
Sleeping 8
Sleeping 9

Btw, the docs should say that timeout is specified in seconds. It took me a while to detect the error because I thought the timeout was too big.
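
A possible explanation, stated as an assumption rather than a confirmed diagnosis: another issue on this page quotes the default lock method as LOCK_EX | LOCK_NB, so passing flags=pl.LockFlags.SHARED on its own may drop the non-blocking bit, and a blocking flock call cannot be cut short by a polling timeout. The sketch below shows, with the standard-library fcntl module on Linux, how a timeout depends on LOCK_NB; acquire_with_timeout is a hypothetical helper. If this is the cause, flags=pl.LockFlags.SHARED | pl.LockFlags.NON_BLOCKING would be worth trying.

```python
import fcntl
import os
import tempfile
import time


def acquire_with_timeout(fh, lock_type, timeout, check_interval=0.05):
    """Poll for the lock with LOCK_NB so the timeout can actually expire."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.flock(fh.fileno(), lock_type | fcntl.LOCK_NB)
            return
        except BlockingIOError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    'could not lock within %s seconds' % timeout)
            time.sleep(check_interval)


path = os.path.join(tempfile.mkdtemp(), '.lock')
with open(path, 'w') as holder, open(path) as reader:
    # Another handle keeps an exclusive lock for the whole attempt.
    fcntl.flock(holder.fileno(), fcntl.LOCK_EX)
    try:
        acquire_with_timeout(reader, fcntl.LOCK_SH, timeout=0.2)
        timed_out = False
    except TimeoutError:
        timed_out = True
```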

Document locking gotcha's on Linux

Following #23, people need to know that Linux doesn't have mandatory locking, be able to validate that, and learn what workarounds portalocker offers in this situation.

DLL load failed while importing win32file

python Version 3.8

[pywin32]==303
[portalocker]==2.4.0

import win32file
ImportError: DLL load failed while importing win32file

I think pywin32-303 doesn't have win32file.pyd

Do you mind/can you say which file has this `Permission denied` on the traceback?

Traceback (most recent call last):
  File "f:\python\lib\site-packages\portalocker\portalocker.py", line 58, in unlock
    msvcrt.locking(file_.fileno(), constants.LOCK_UN, 0x7fffffff)
PermissionError: [Errno 13] Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:\Python\lib\logging\__init__.py", line 1940, in shutdown
    h.close()
  File "F:\SublimeText\Data\Packages\concurrentloghandler\all\concurrent_log_handler\__init__.py", line 320, in close
    unlock(self.stream_lock)
  File "f:\python\lib\site-packages\portalocker\portalocker.py", line 61, in unlock
    exceptions.LockException.LOCK_FAILED, exc_value.strerror)
portalocker.exceptions.LockException: (1, 'Permission denied')

Do you mind/can you say which file has this Permission denied on the traceback?

Because I do not know which one it is.

Lock file gets blocked on Windows if put in user home directory and not unlocked explicitly

This is more a question for now, since I'm trying to find what the exact problem is.

I work on Windows 10 and have a problem with a lock file at ~/.myapp/lock.

I used a library called msal-extensions, which relies on portalocker to lock the token cache. The details are not important, here is the code of their lock class:

class CrossPlatLock(object):
    """Offers a mechanism for waiting until another process is finished interacting with a shared
    resource. This is specifically written to interact with a class of the same name in the .NET
    extensions library.
    """
    def __init__(self, lockfile_path):
        self._lockpath = lockfile_path
        self._fh = None

    def __enter__(self):
        pid = os.getpid()

        self._fh = open(self._lockpath, 'wb+', buffering=0)
        portalocker.lock(self._fh, portalocker.LOCK_EX)
        self._fh.write('{} {}'.format(pid, sys.argv[0]).encode('utf-8'))

    def __exit__(self, *args):
        self._fh.close()
        try:
            # Attempt to delete the lockfile. In either of the failure cases enumerated below, it is
            # likely that another process has raced this one and ended up clearing or locking the
            # file for itself.
            os.remove(self._lockpath)
        except OSError as ex:
            if ex.errno != errno.ENOENT and ex.errno != errno.EACCES:
                raise

As you can see, upon exit the file handle is closed and then removed (relying on portalocker to unlock it). However the file (or something) remains in the filesystem and becomes completely blocked by an unknown process (tried processexplorer and handle to figure it out - I can only see that it's blocked by SYSTEM). As a result the subsequent calls result in an 'Access denied' exception.

I've tried to play with it and found out that:

  • if I remove writing to the file inside __enter__ - it works;
  • if I explicitly call portalocker.unlock before closing the file inside __exit__ - it also works.

Can you provide any help/thoughts on where the problem is? I've submitted a bug to msal-extensions as well, but they also asked me to check with portalocker.

PyWin32 as dependency?

Hi!
Not sure if that's an issue, but I had to install pywin32 on Windows 10 myself. Shouldn't it be listed somewhere, or installed as a dependency during the portalocker installation process? (I'm not very good at writing setup.py files, so I can't provide a PR, but I believe this problem can be solved by adding the package name somewhere.)
Win 10, Python 3.6.4, virtualenv

Declare PEP 561 compliance (type checking support)

Type checkers do not yet know that portalocker now supports type checking.

Type hinting was recently added to this library, as of commit 41de2a0, or version 2.1.0. (Good job on that. I was not expecting it to be added yet!) This means library users can run type checkers on their code against the types defined by the library. A simple usage example is as follows:

#!/usr/bin/env bash

# Make sure the current version has type hinting.
# (Quoted so the shell does not treat ">=" as an output redirection.)
python3 -m pip install 'portalocker>=2.1.0'
# Create a file to use the library.
file=watchcat.py
echo import portalocker > $file
# Type check it.
mypy $file

Expected and current behaviours

The type checking should pass, since the Python code only imports portalocker. The output I have, on Python 3.9.1, is instead

watchcat.py:1: error: Skipping analyzing 'portalocker': found module but no type hints or library stubs
watchcat.py:1: note: See https://mypy.readthedocs.io/en/latest/running_mypy.html#missing-imports
Found 1 error in 1 file (checked 1 source file)

Possible solution

Once a package has type hinting, it can inform type checkers by including a file named py.typed in the installed package directory, as specified in PEP 561. (I only found out about this today.) In the case of this project, it is sufficient to

  1. Create the file portalocker/py.typed.
  2. Include this in setup.py: package_data={'portalocker': ['py.typed']},.

The following Python script should achieve this.

#!/usr/bin/env python3

import pathlib

pathlib.Path('portalocker/py.typed').touch()
setup_path = pathlib.Path('setup.py')
with setup_path.open() as setup_file:
    lines = setup_file.readlines()
if not any('package_data' in line for line in lines):
    lines.insert(
        len(lines) - 1,
        "        package_data={'portalocker': ['py.typed']},\n",
    )
    with setup_path.open('w') as setup_file:
        setup_file.writelines(lines)

Once patched and reinstalled, the mypy usage above should pass.

Opening a write-locked file, but can't find contents (when opening a pickled pandas DataFrame)

Hello, I posted a stackoverflow post about this: https://stackoverflow.com/questions/53044203/write-locked-file-sometimes-cant-find-contents-when-opening-a-pickled-pandas-d

Basically, I need to write-lock a .pickle file, read it, add a row, and save it again. This is done by up to 300 simultaneous processes on the same file; they're all adding data.

Anyway, the issue I'm getting is that as I increase the number of simultaneous processes, I start to get an error where a process will obtain write-lock, but isn't able to find the contents somehow. So the read fails. Here's my code:

with portalocker.Lock('/path/to/file.pickle', 'rb+', timeout=120) as file:
    file.seek(0)
    df = pd.read_pickle(file)

    # ADD A ROW TO THE DATAFRAME

    # The following part might not be great,
    # I'm trying to remove the old contents of the file first so I overwrite
    # and not append, not sure if this is required or if there's
    # a better way to do this.
    file.seek(0)
    file.truncate()
    df.to_pickle(file)

It fails at pd.read_pickle. I get a convoluted traceback from pandas, and the following error:

EOFError: Ran out of input

The contents are there afterwards: after all the processes finish, I have no problem reading the DataFrame. Not to mention, some (most) of the processes end up finding the contents and updating the DataFrame without a hitch. But with 300 simultaneous processes, up to 30-40% end up failing.

Since it works sometimes but not all the time, I assumed it must be some problem where the previous process saves the file and exits write-lock, but the contents don't get saved in time or for some reason can't be read if the next process opens up the file too early. Is this possible in any way?

Also, since you're the experts here, perhaps my code above could use improvements. I'd be glad to hear if there's a better way of doing it.

EDIT: Another thing I wanted to ask is; what happens if the write-lock waits for too long but times out? I gave it 120 seconds which seemed enough to me (I estimate on average I have about 2.5 writes per second, for a 300KB pickle file). I tried adding a flag that would trip if the write-lock timed out, but is there a way to make portalocker.Lock return an error if it times out, just to be sure?
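
Whether buffering is the cause here is not established, but the Tips section suggests flushing and syncing before the lock is released, which is cheap to try. The sketch below shows that read-modify-write cycle using the standard-library fcntl.flock on Linux, with a plain pickled list standing in for the DataFrame; append_row and the file name are hypothetical.

```python
import fcntl
import os
import pickle
import tempfile


def append_row(path, row):
    """Read, extend, and rewrite a pickled list while holding an exclusive lock."""
    with open(path, 'rb+') as fh:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)  # blocks until the lock is free
        rows = pickle.load(fh)
        rows.append(row)

        # Overwrite in place, then push the bytes to disk before the lock
        # is dropped (closing the file releases it).
        fh.seek(0)
        fh.truncate()
        pickle.dump(rows, fh)
        fh.flush()
        os.fsync(fh.fileno())


path = os.path.join(tempfile.mkdtemp(), 'rows.pickle')
with open(path, 'wb') as fh:
    pickle.dump([], fh)

append_row(path, {'id': 1})
append_row(path, {'id': 2})

with open(path, 'rb') as fh:
    stored = pickle.load(fh)
```

On the timeout question, the traceback in the "Timeout not working with SHARED flag" issue on this page shows portalocker.Lock raising portalocker.exceptions.LockException when the timeout expires, so wrapping the with statement in a try/except for that exception may be enough.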

Support for other Unixes

On Unix, portalocker uses flock() which isn't part of the POSIX standard and is unavailable on Unix systems like Solaris. I'd like to use portalocker in one of my projects, but we have some users on more obscure Unix systems like Solaris that we don't want to break compatibility with. The alternative to flock() is lockf() or fcntl() which are part of the POSIX standard and have the additional benefit of (more likely) working across NFS better than flock().

Would you be willing to either:

  1. Switch to lockf() or fcntl() over flock() on POSIX systems.
  2. Fall back to lockf() or fcntl() if flock() is unavailable.

ModuleNotFoundError

If I have a file that is:

import portalocker

It gives a ModuleNotFoundError:

No module named 'portalocker'

Distribution of `tests` in top level `site-packages`

Hello,

Based on our troubleshooting, we've found an issue with this package where the tests appear to be inadvertently shipped as part of the source distribution, and are put in the top level python/3.x/site-packages/tests folder. This is a problem for us during our test suite execution, as our imports are also located in a local folder called tests, but due to sys.path precedence, it appears that this package's tests are included as well.

If possible please change the installation of tests to no longer be installed in a top level directory. We are using this package via the msal-extensions package.

Steps to Reproduce

  1. Create a virtualenv: python -m venv env
  2. Activate: source env/bin/activate
  3. Install portalocker from PyPi: pip install portalocker==1.7.0
  4. Navigate to env/lib/python3.x/site-packages/tests
  5. Observe that all tests from portalocker_tests are present.

Passing arguments to underlying open()

Right now, it doesn't seem to be possible to pass argument to file.open() when using the with portalocker.Lock() construct. For instance, I would like to be able to open a csv file with a specific newline argument, like so:

with portalocker.Lock('test.csv', 'a', timeout=1, newline='') as csvfile:
    writer = csv.writer(csvfile)

I imagine that the Lock() constructor could take a **kwargs as the last arguments, which is then passed on to file.open().
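
The **kwargs pass-through imagined above can be sketched as follows. This is not portalocker's API: locked_open is a hypothetical context manager built on the standard-library fcntl.flock (Linux), shown only to illustrate forwarding extra keyword arguments to open().

```python
import contextlib
import csv
import fcntl
import os
import tempfile


@contextlib.contextmanager
def locked_open(path, mode='a', **open_kwargs):
    """Open path with any extra open() arguments and hold an exclusive lock."""
    fh = open(path, mode, **open_kwargs)
    try:
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX)
        yield fh
    finally:
        fh.close()  # closing the handle also releases the lock


csv_path = os.path.join(tempfile.mkdtemp(), 'test.csv')
with locked_open(csv_path, 'a', newline='') as csvfile:
    csv.writer(csvfile).writerow(['name', 'value'])

# Read back without newline translation to see what was written.
with open(csv_path, newline='') as check:
    written = check.read()
```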

Setting the DenyMode against a file

Hi there,

Thanks for such a great library and tool. We had a question regarding the library's ability to pass through the DenyMode to an underlying SMB server.

We have an SMB connection, mounted like such:

mount -t cifs -o rw,soft,nolock,iocharset=utf8,file_mode=0600,dir_mode=0700,user='admin',password='***',vers=2.0,mapchars,uid=5000,gid=5000 '//IPADDRESS/sambashare' '/mnt/CIFS/77ae37408e5e9d18c5a04987c/'

With portalocker we are running:

import portalocker
import time

file = open('/mnt/CIFS/77ae37408e5e9d18c5a04987c/output.txt', 'r+')
portalocker.lock(file, portalocker.LOCK_EX)
time.sleep(5)

When we inspect the smbstatus we observe the following:

Locked files:
Pid          User(ID)   DenyMode   Access      R/W        Oplock           SharePath   Name   Time
--------------------------------------------------------------------------------------------------
3443         1000       DENY_NONE  0x12019f    RDWR       EXCLUSIVE+BATCH  /sambashare   output.txt   Fri Oct  8 04:50:04 2021

It has essentially set the access mask + oplock, but not the DenyMode.

Is there an additional flag we can set to set DENY_WRITE?

Many thanks in advance.

Neither exclusive nor shared locks working as expected on Windows

Morning!

I am trying to use portalocker for some system-wide resource management under Windows.
When I run either of the following from two different programs:
portalocker.lock(fileHandle, portalocker.LOCK_EX)
Or
portalocker.lock(fileHandle, portalocker.LOCK_SH)

I always get the following on the second program:
"LockException: (1, 'The process cannot access the file because another process has locked a portion of the file.')"

My expectation was that exclusive lock is a write lock and the shared lock is a read lock, but instead both:

  • exhibit exclusive locking in Python
  • when locked:
    • can be read from notepad
    • cannot be written from notepad
    • cannot be moved from file explorer

Rlock - with limit?

Hi, thanks for the library.

I noticed a PR in 2017 that introduces a reentrant lock but the documentation doesn't seem to cover RLock.

I was wondering if it is possible to enhance RLock with a lock count? In other words, I'd like to only allow n reentrant requests across all processes.

My use case may be odd, but I thought it was very appropriate to try and use portalocker for my purpose: I have a GPU that has 4GB of RAM. I have multiple processes that try to invoke the GPU for deep learning operations, and beyond 2-3 simultaneous calls, the GPU crashes. My idea therefore was to use this library to allow only 2 reentrant calls and make the others wait until the count goes back down. I could of course go the route of a queue, but this seems much simpler.

thanks.
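
One way to picture the request is a counting lock built from a fixed number of slot files. The sketch below is an assumption-laden illustration, not an existing portalocker API: `limited_slot` and `SlotTimeout` are made-up names, and it uses exclusive file creation from the standard library instead of portalocker's own locks. A process that crashes while holding a slot leaves its slot file behind, so a real implementation would need cleanup for stale slots.

```python
import contextlib
import os
import tempfile
import time


class SlotTimeout(Exception):
    """Raised when no slot became free in time (hypothetical name)."""


@contextlib.contextmanager
def limited_slot(directory, limit, timeout=10.0, check_interval=0.25):
    """Hold one of `limit` slots; at most `limit` holders exist at once."""
    deadline = time.monotonic() + timeout
    slot_path = None
    while slot_path is None:
        for index in range(limit):
            candidate = os.path.join(directory, "slot-%d.lock" % index)
            try:
                # O_EXCL makes creation fail if the slot file already exists.
                fd = os.open(candidate, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            except FileExistsError:
                continue
            os.close(fd)
            slot_path = candidate
            break
        else:
            # Every slot is taken: give up after the deadline, else retry.
            if time.monotonic() >= deadline:
                raise SlotTimeout("no free slot among %d" % limit)
            time.sleep(check_interval)
    try:
        yield slot_path
    finally:
        # Releasing the slot is just removing its file.
        os.remove(slot_path)


if __name__ == "__main__":
    slots = tempfile.mkdtemp()
    with limited_slot(slots, limit=2) as slot:
        print("holding", slot)
```

Each GPU job would wrap its work in `with limited_slot(shared_dir, limit=2):`, so a third job waits until one of the two slot files disappears.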

Lock.acquire() does not have a matching release() method.

My understanding is that the Python convention on locks is to have methods acquire and release for callers who want to manage releasing themselves, and __enter__ and __exit__ for those using the lock as a context manager. portalocker.Lock has everything but release.

What are the different locking mechanisms?

Hi Rick, thanks for this portalocker library! It has been reliably serving us for quite a while. :-)

We recently encountered a cross-library issue likely caused by different locking mechanisms. Do you have some input on the pros and cons of the two different approaches, and on why you chose the one currently used in portalocker?

tests directory

Hi.

Please remove the tests directory from the distribution; it conflicts with other packages.

Thank you!

race condition for… file metadata?

look, I know how weird this looks, but hear me out.
I somehow managed to get a race condition(?) where portalocker.open('file/path', 'a') would not actually get me at the end of the file I opened.
There were a lot of moving parts in this, with potential culprits such as:

  • an NFS filesystem
  • accessing a file from different machines within a cluster
  • an EOL'd kernel on every machine of said cluster (!)

here are a few small working examples.

script1.py

import portalocker
with portalocker.Lock('file/path','a') as f:
    f.write('prelude,')
    print(f.tell())
    input('start script2 and continue')
    f.write('AAAA')

script 2, variant a:

import portalocker
with portalocker.Lock('file/path','a') as f:
    print(f.tell())
    f.write('BBBB')

script 2, variant b:

import portalocker
from time import sleep
with portalocker.Lock('file/path','a') as f:
    print(f.tell())
    print(f.seek(0,2))
    sleep(5)
    print(f.tell())
    print(f.seek(0,2))

I emptied the file between all tests.

test 1: running script1 on node1, then quickly script2a on node2:
script1 prints 8\nstart script2 and continue, script2 prints 0, and the file contains BBBBude,AAAA

test 2: running script1 on node1, then quickly script2b on node2:
script2 prints 0\n0\n0\n12. (although the results can change with the delay set in sleep())

test 3: running script1 on node1, waiting, and running script2b on node2:
script2 prints 8\n8\n8\n12 or 8\n8\n12\n12.

test 4: running script1 on node1, waiting, and running script2b on node2:
script2 prints 8\n12\n12\n12?

I have yet to make tests where one or more of those scripts run on the controller node (which is the target of the NFS mount involved)

I am aware this might not be an issue on your side specifically, but like… yeah.
I hope I'm not wasting your time with this.
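
To make the discrepancy easier to report, the three ways of asking "where is the end of this file?" can be collected in one call. This is only a diagnostic sketch with a made-up helper name (`size_views`); it does not fix anything, and it assumes nothing about why the values differ on NFS.

```python
import os
import tempfile


def size_views(fh):
    """Return the file size as seen through three different calls."""
    fh.flush()
    position = fh.tell()                       # current position
    end = fh.seek(0, os.SEEK_END)              # position after seeking to EOF
    stat_size = os.fstat(fh.fileno()).st_size  # size from the file metadata
    return {"tell": position, "seek_end": end, "fstat": stat_size}


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    with open(path, "a") as f:
        f.write("prelude,")
        print(size_views(f))
```

Calling it right after the lock is acquired in script 2 would show whether the stale value comes from the position, the seek, or the metadata.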

which cases would lead to exception in unlock

I see that the unlock function tries to catch several exceptions. However, which cases would lead to an exception in unlock? Could you give some examples?

Should we always wrap unlock in try...except?


Shared locks with msvcrt backend

I wrote a script a while ago using portalocker 0.5.5, which locked a file with LOCK_SH | LOCK_NB, then copied it with shutil.copyfile(). This prevented modifying the file during copying (large files, shared network drives).

I recently upgraded to 1.0.0, and the copy then failed with a Permission denied error. I checked the changes and verified that the old win32file.LockFileEx() function allows shutil.copyfile() to access the file locked with a shared lock (but prevents modifications), whereas the new msvcrt.locking() function does not. (I solved my issue by using shutil.copyfileobj() instead.)

I checked the documentation of msvcrt.locking() (which has been used since 0.6), and it says nothing about sharing the lock with other processes. It even states that _LK_NBRLCK (equivalent to LOCK_SH | LOCK_NB) is the same as _LK_NBLCK (equivalent to LOCK_EX | LOCK_NB).

Does this still count as a shared lock?

Fix `ResourceWarning` when `AlreadyLocked`

This is the same problem as issue #50, appearing on another line.

When acquiring a portalocker.Lock, there are two relevant parameters: fail_when_locked, which indicates whether to raise an AlreadyLocked exception if the first attempt at locking fails, and timeout, which indicates how long to keep trying if locking fails.

The unit test test_with_timeout suggests that fail_when_locked=True combined with a positive timeout is a valid combination of arguments.

with portalocker.Lock(tmpfile, timeout=0.1, mode='wb',
                      fail_when_locked=True):

So there should be no warnings issued from using it. In particular, no ResourceWarning from not closing a file handle.

Expected Behavior

In the following, we lock a temporary file, and then try to acquire another lock on the same file again. All ResourceWarning-s are set to be shown to check if they are issued.

#!/usr/bin/env python3

import portalocker
import tempfile
import warnings

warnings.filterwarnings('always', category=ResourceWarning)
with tempfile.NamedTemporaryFile() as pid_file:
    with portalocker.Lock(filename=pid_file.name):
        try:
            portalocker.Lock(
                fail_when_locked=True,
                filename=pid_file.name,
                timeout=0.1,
            ).acquire()
        except portalocker.exceptions.AlreadyLocked:
            pass
        else:
            raise RuntimeError('Expected AlreadyLocked.')

The acquire docstring says that

fail_when_locked is useful when multiple threads/processes can race
when creating a file. If set to true than the system will wait till
the lock was acquired and then return an AlreadyLocked exception.

So the acquire call should raise an AlreadyLocked exception. There should be no ResourceWarning-s issued.

Current Behavior

The exception is indeed raised. However, the warning is issued.

$ ./test_portalocker.py
/home/boni/base/src/tmp/./test_portalocker.py:17: ResourceWarning: unclosed file <_io.TextIOWrapper name='/tmp/tmpkk2prhbu' mode='a' encoding='UTF-8'>
  pass
ResourceWarning: Enable tracemalloc to get the object allocation traceback

This occurs because the file handle to the pid_file used for locking was not closed before raising the error.

fh = self._get_fh()
try:
    fh = self._get_lock(fh)
except exceptions.LockException as exception:
    timeout_end = current_time() + timeout
    while timeout_end > current_time():
        time.sleep(check_interval)
        try:
            if fail_when_locked:
                # ---- HERE ----
                # The `fh` is closed for timeout in `else` below,
                # but not for failing.
                raise exceptions.AlreadyLocked(exception)
            else:
                fh = self._get_lock(fh)
                break
        except exceptions.LockException:
            pass
    else:
        fh.close()
        raise exceptions.LockException(exception)

# Prepare the filehandle (truncate if needed)
fh = self._prepare_fh(fh)

self.fh = fh
return fh

Possible Solution

The simplest fix is to add fh.close() before raising the exception.

A more defensive alternative, which also guards against, say, KeyboardInterrupt, is to wrap the whole thing in a try/finally that closes the handle. (There is also the option of dropping Python 2.7 support and using a context manager!)

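# Open the handle before the try block, so fh is always bound in finally.
fh = self._get_fh()
try:
    ...
    return fh

finally:
    # self.fh is only set on success; close the handle on any failure path.
    if self.fh is None:
        fh.close()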

I am not sure how to tell coverage that fh.close() -> return fh is not possible though.
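
The try/finally idea can be tried out in isolation. The following is a simplified sketch, not the library's code: the exception classes are local stand-ins, `try_lock` is a hypothetical callable that either locks the handle or raises `LockException`, and the retry loop is reduced to the minimum needed to show that the handle is closed on every failure path.

```python
import os
import tempfile
import time


class LockException(Exception):
    """Stand-in for the library's lock error (defined here for the sketch)."""


class AlreadyLocked(LockException):
    """Stand-in for the 'already locked' error."""


def acquire(path, try_lock, timeout=0.1, check_interval=0.01,
            fail_when_locked=False):
    """Open `path` and lock it, closing the handle on every failure path."""
    fh = open(path, "a")
    locked = False
    try:
        deadline = time.monotonic() + timeout
        while True:
            try:
                try_lock(fh)
                locked = True
                return fh
            except LockException as exception:
                if fail_when_locked:
                    raise AlreadyLocked(exception) from exception
                if time.monotonic() >= deadline:
                    raise
                time.sleep(check_interval)
    finally:
        # Runs for AlreadyLocked, timeouts and KeyboardInterrupt alike.
        if not locked:
            fh.close()


if __name__ == "__main__":
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    handle = acquire(demo_path, lambda fh: None)
    print("locked:", not handle.closed)
    handle.close()
```

Tracking success in a local flag instead of checking `self.fh` keeps the cleanup decision inside the function, at the cost of one extra variable.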

Context (Environment)

I am using Debian bullseye/sid, Python 3.9.
