
HLearn.History (hlearn issue, open)

mikeizbicki commented on June 20, 2024
HLearn.History

from hlearn.

Comments (10)

mikeizbicki commented on June 20, 2024

Thanks again for the detailed comments! I'll try to respond to everything point-by-point, but if I miss something let me know.

API & HistoryT

I really like the idea of a HistoryT monad and having History = HistoryT Id. I think this shouldn't be too hard to do and I'd be open to a pull request for this.

I've thought about moving this monad into a separate library to make it more general purpose. As a separate library, this wouldn't depend on subhask at all, which I think would make it much easier to adopt. I haven't done this yet because I think it would slow down my development time a bit. But if you think it'd be useful I'd be up for it. Can you say a bit more about your use case?
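For illustration, here is a minimal sketch of what a standalone HistoryT could look like, with History defined as the transformer over Identity. It is not HLearn's implementation: the Report type is a hypothetical stand-in that only carries a label, and report, runHistoryT and runHistory are assumed names with simplified signatures. It depends only on the transformers package, not on subhask.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Trans.Class (MonadTrans (..))
import Control.Monad.Trans.State.Strict (StateT, modify', runStateT)
import Data.Functor.Identity (Identity, runIdentity)

-- | Hypothetical stand-in for HLearn's Report record: just a label here.
newtype Report = Report String
    deriving (Eq, Show)

-- | A history transformer that accumulates reports (stored newest first).
newtype HistoryT m a = HistoryT (StateT [Report] m a)
    deriving (Functor, Applicative, Monad)

instance MonadTrans HistoryT where
    lift = HistoryT . lift

-- | The pure variant is the transformer over Identity.
type History = HistoryT Identity

-- | Record one report and pass the value through unchanged.
report :: Monad m => String -> a -> HistoryT m a
report label a = HistoryT (modify' (Report label :)) >> return a

-- | Run the transformer, returning the result and the reports in
-- the order they were recorded.
runHistoryT :: Monad m => HistoryT m a -> m (a, [Report])
runHistoryT (HistoryT st) = do
    (a, rs) <- runStateT st []
    return (a, reverse rs)

-- | Run the pure variant.
runHistory :: History a -> (a, [Report])
runHistory = runIdentity . runHistoryT
```

Loaded into GHCi, `runHistory (report "start" 1 >>= report "done")` yields the value together with both reports in order, and `lift` lets an IO action run inside `HistoryT IO`.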

Relationship to Criterion

Totally agree. That's been one of my wishlist features for a while now. Another feature I want to add is measuring perf events, like cache misses and branch mispredictions, that happen during a reporting period. I just haven't had a chance to do this yet.

SubHask

One of my goals with subhask is to make all instances of Control.Monad.Monad automatically an instance of SubHask.Monad.Monad. There's already some template haskell code that works for most monads, and it works for the History monad just fine. That's one of the reasons that moving the history monad to a separate library wouldn't be too big a deal.

I haven't yet implemented anything related to monad transformers in subhask. The reason is that I haven't thought enough about the consequences of having a monad in one category transform a monad in another category. This shouldn't be an obstacle to making HistoryT, though, since (at least at this point) it will only be transforming monads in Hask.

Reportable versus Optimizable

I don't think there should be any issues when using History on non-looping things. I've done some simple tests that way, although I don't think any of them made it into the repo.

The Optimizable constraint should really be called Reportable since it's not just for optimization any more.


Now to address a point you didn't bring up. There's an ugly side to the History monad right now in that it makes type signatures a pain. See the type signature:

fminunc :: (Optimizable a, OrdField a) => (a -> a) -> a

in the Univariate.hs file.

I want to get the Optimizable a constraint removed from the type signature. It's not needed when the History monad is evaluated using the evalHistory function. The last time I thought about this I was using GHC 7.8, so there might be some new features that would let me do this, but I haven't looked into it in a while.


tonyday567 commented on June 20, 2024

My use case for History is pretty much yours, lol. See https://github.com/tonyday567/digit-recognizer/blob/master/testing/BenchKnn.hs#L208. I'd like to replace all the time and timeIOs with report and reportIOs (or similar), and include the info wrapped up in Criterion.Extended.

wrt the ugly side, that sig is due to infoType. You could give up on automatically using the type as the report collection label and accept a hard-coded label (of type s), e.g.

report :: s -> a -> History_ s a

or you could go the whole hog, and use a String as the label as beginFunction does, making it:

newtype History a = History (ReaderT DisplayFunction (StateT (String, [Report]) IO) a)
report :: String -> a -> History a

or even

newtype (MonadIO m) => HistoryT m a = HistoryT (ReaderT DisplayFunction (StateT (String, [Report]) m) a)


mikeizbicki commented on June 20, 2024

I went back and looked at the History monad a bit more today. I managed to refactor the Optimizable constraint out of all the type signatures. I'm quite a bit happier with how it looks now. I've just uploaded the changes to GitHub: https://github.com/mikeizbicki/HLearn/blob/ghc7.10/src/HLearn/Optimization/Univariate.hs

There are still a few odd constraints on some of the functions. I think I can remove them too, but I'm off to bed now :)


tonyday567 commented on June 20, 2024

Looking at how to integrate History.Timing with History, I ran into a brick wall.

The problem that I'm trying to solve is that the Report/Measure thing is related to the context used in the step. So Report as written matches CountInfo as written (and also goes with the hardcoded getCPUTime in runHistory for example).
My brain then finds it difficult to abstract these relationships for the stepDisplayFunction sig, forall a. cxt a => Report -> cxt -> a -> (cxt, IO ()) given the commonalities between Report and cxt.

The ideal would be for Report/Context to be built up using disparate effects. The components look like pre-computation data gathering (eg getCPUTime), post-computation compute, a running total and display of the data. I came up with this data type:

data Measure = forall a b. (Monoid a, Monoid b) => Measure
    { measure :: b
    , prestep :: IO a
    , poststep :: a -> b -> IO b
    , display :: b -> String
    }

An experiment using this is here: https://github.com/tonyday567/HLearn/blob/ghc7.10dev/src/HLearn/History/Measure.hs

But this turned out to be a dead end: it's hard to use because the existentially quantified types are swallowed.
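As an illustration of the "swallowed types" problem, the sketch below shows one common way to work with an existential record like this: GHC rejects the field selectors when used as functions, but pattern matching on the constructor brings the hidden types into scope, as long as only non-hidden types escape. The helper names stepMeasure, displayMeasure, stepCounter and cpuTimeMeasure are hypothetical, and the Monoid constraints are dropped here for brevity; this is not the code from the linked experiment.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import System.CPUTime (getCPUTime)

-- | The Measure record from the comment above, minus the Monoid
-- constraints (an assumption made to keep the sketch small).
data Measure = forall a b. Measure
    { measure  :: b                -- running total
    , prestep  :: IO a             -- data gathered before the step
    , poststep :: a -> b -> IO b   -- update the total after the step
    , display  :: b -> String      -- render the total
    }

-- | Run one step. The new total is repackaged into a Measure, so
-- the hidden types a and b never escape.
stepMeasure :: Measure -> IO x -> IO (x, Measure)
stepMeasure (Measure acc pre post render) action = do
    a    <- pre
    x    <- action
    acc' <- post a acc
    return (x, Measure acc' pre post render)

-- | Render the current running total.
displayMeasure :: Measure -> String
displayMeasure (Measure acc _ _ render) = render acc

-- | Counts how many steps have been run.
stepCounter :: Measure
stepCounter = Measure (0 :: Int) (return ())
    (\() n -> return (n + 1))
    (\n -> "steps: " ++ show n)

-- | Accumulates CPU time (picoseconds) spent inside the steps.
cpuTimeMeasure :: Measure
cpuTimeMeasure = Measure (0 :: Integer) getCPUTime
    (\start total -> do
        end <- getCPUTime
        return (total + end - start))
    (\ps -> "cpu: " ++ show ps ++ " ps")
```

The trade-off is that a caller can only ever observe a Measure through displayMeasure, which is part of why this design is awkward for anything beyond printing.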


mikeizbicki commented on June 20, 2024

This change shouldn't require adding a new Measure type or really interacting with the Report type at all.

We can create two new functions that are analogues of the time and timeIO functions like:

withMsg :: (cxt String, NFData a) => String -> a -> History_ cxt s a
withMsg msg a = withMsgIO msg (return a)

withMsgIO :: (cxt String, NFData a) => String -> IO a -> History_ cxt s a
withMsgIO msg ioa = do
    a <- History $ liftIO ioa
    report $ deepseq a $ msg
    return a

I haven't actually tested these functions, but I think they should work.

Then the next step is to write a DisplayFunction_ that processes the History monad in a way that recreates the output that timeIO did. I think this should only require writing the stepDisplayFunction in a way that doesn't use the s state variable. In particular, the specialized type signature would read:

stepDisplayFunction  :: forall a. cxt a => Report -> () -> a -> ((), IO ())

Then in order to get the message printed to the screen, stepDisplayFunction will run show on the input variable of type a, requiring cxt ~ Show.

Does that explanation make sense?


tonyday567 commented on June 20, 2024

I think so - will give it a try! It won't give you real time information though.

The part I left out was that I'd like (personally) to collect GC information that you can get from GHC.Stats, so I wanted to write:

import GHC.Stats

data Report = Report
    { cpuTimeStart  :: !CPUTime
    , cpuTimeDiff   :: !CPUTime
    , gcStatsStart  :: !GCStats
    , gcStatsDiff   :: !GCStats
    , numReports    :: {-#UNPACK#-}!Int
    , reportLevel   :: {-#UNPACK#-}!Int
    }
    deriving Show

but that would then involve a rewrite of runHistory and report. So I was thinking about how to generalise this (adding cache misses etc.) without having to rewrite them every time.
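To make the idea concrete, here is a rough sketch of collecting CPU time and GC counters around a single step. It uses the newer RTSStats API (getRTSStats, available in base 4.10 and later) rather than the GCStats record shown above, and the names StepReport, snapshot, diffField and measureStep are hypothetical, not part of HLearn. RTS statistics are only available when the program is started with +RTS -T, so those fields are optional here.

```haskell
import Data.Word (Word32, Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.CPUTime (getCPUTime)

-- | Hypothetical per-step report: differences taken across one step.
data StepReport = StepReport
    { cpuTimeDiff   :: !Integer        -- picoseconds, from getCPUTime
    , gcCountDiff   :: !(Maybe Word32) -- collections, if RTS stats are on
    , allocatedDiff :: !(Maybe Word64) -- bytes allocated, if RTS stats are on
    }
    deriving Show

-- | Snapshot the RTS statistics, or Nothing when they are disabled
-- (getRTSStats throws an exception in that case).
snapshot :: IO (Maybe RTSStats)
snapshot = do
    enabled <- getRTSStatsEnabled
    if enabled then Just <$> getRTSStats else return Nothing

-- | Difference of one monotonic counter between two snapshots.
diffField :: Num n => Maybe RTSStats -> Maybe RTSStats -> (RTSStats -> n) -> Maybe n
diffField before after field = (\b a -> field a - field b) <$> before <*> after

-- | Run an action and report what changed while it ran. The action
-- must force its own result (e.g. with deepseq); nothing is forced here.
measureStep :: IO a -> IO (a, StepReport)
measureStep action = do
    s0 <- snapshot
    t0 <- getCPUTime
    a  <- action
    t1 <- getCPUTime
    s1 <- snapshot
    let gcsDiff   = diffField s0 s1 gcs
        allocDiff = diffField s0 s1 allocated_bytes
    return (a, StepReport (t1 - t0) gcsDiff allocDiff)
```

Adding another counter then means adding one field and one diffField call, though hardware events such as cache misses are not exposed by GHC.Stats and would need a different source.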


mikeizbicki commented on June 20, 2024

Ahh... I see. That's actually a really interesting idea... I'll have to start thinking about that too.


tonyday567 commented on June 20, 2024

I did some testing of rdtsc and it has outstanding metrics, compared with getCPUTime and getCurrentTime. It has to be the future of high performance regressions.

https://github.com/tonyday567/perf


mikeizbicki commented on June 20, 2024

That looks super nice!

After a quick look through, this might be using the same API I was hoping to use to measure cache performance.


tonyday567 commented on June 20, 2024

I don't think you're going to be that lucky, sorry.
You're looking for the rdpmc instruction. It's a lot more complicated and I haven't seen a clean *.c for this - it needs drivers and all of that guff. The linux perf tool is only a command line thing, and the haskell wrappers out there read the data file that this command creates.

