Comments (10)
Thanks again for the detailed comments! I'll try to respond to everything point-by-point, but if I miss something let me know.
API & HistoryT
I really like the idea of a HistoryT monad and having History = HistoryT Id. I think this shouldn't be too hard to do and I'd be open to a pull request for this.
I've thought about moving this monad into a separate library to make it more general purpose. As a separate library, this wouldn't depend on subhask at all, which I think would make it much easier to adopt. I haven't done this yet because I think it would slow down my development time a bit. But if you think it'd be useful I'd be up for it. Can you say a bit more about your use case?
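For concreteness, here is a minimal sketch of what a standalone HistoryT could look like. It is not HLearn's actual definition: the Report type, the report and runHistoryT functions, and the choice of a plain StateT over a list are all simplifying assumptions, and Identity from base stands in for Id.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Trans.State.Strict (StateT, modify', runStateT)
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical, simplified report: a label plus a nesting level.
data Report = Report
    { reportLabel :: String
    , reportLevel :: Int
    } deriving (Show, Eq)

-- Transformer version: reports accumulate over any base monad m.
newtype HistoryT m a = HistoryT { unHistoryT :: StateT [Report] m a }
    deriving (Functor, Applicative, Monad)

-- The non-transformer monad is the transformer applied to Identity.
type History = HistoryT Identity

-- Record a labelled value and pass the value through unchanged.
report :: Monad m => String -> a -> HistoryT m a
report label a = HistoryT $ do
    modify' (Report label 0 :)
    return a

-- Run the computation, returning the result and the reports in order.
runHistoryT :: Monad m => HistoryT m a -> m (a, [Report])
runHistoryT h = do
    (a, rs) <- runStateT (unHistoryT h) []
    return (a, reverse rs)

runHistory :: History a -> (a, [Report])
runHistory = runIdentity . runHistoryT

-- Works in any base monad, including Identity and IO.
example :: Monad m => HistoryT m Int
example = do
    x <- report "start" 1
    y <- report "double" (x * 2)
    return (x + y)
```

With these definitions, runHistory example evaluates purely, while runHistoryT example can be run directly in IO. The sketch only needs base and transformers, which is the point of splitting the monad out of subhask.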
Relationship to Criterion
Totally agree. That's been one of my wishlist features for a while now. Another feature I'd like to add is measuring perf timing events, like cache misses and branch mispredicts, that happen during a reporting period. I just haven't had a chance to do this yet.
SubHask
One of my goals with subhask is to make all instances of Control.Monad.Monad automatically an instance of SubHask.Monad.Monad. There's already some template haskell code that works for most monads, and it works for the History monad just fine. That's one of the reasons that moving the history monad to a separate library wouldn't be too big a deal.
I haven't yet implemented anything related to monad transformers in subhask. The reason is I haven't thought enough about the consequences of having a monad in one category transform a monad in another category. This shouldn't be an obstacle to making HistoryT though, since (at least at this point) it'll only be transforming monads in Hask.
Reportable versus Optimizable
I don't think there should be any issues when using History on non-looping things. I've done some simple tests that way, although I don't think any of them made it into the repo.
The Optimizable constraint should really be called Reportable since it's not just for optimization any more.
Now to address a point you didn't bring up. There's an ugly side to the History monad right now in that it makes type signatures a pain. See the type signature:
fminunc :: (Optimizable a, OrdField a) => (a -> a) -> a
in the Univariate.hs file.
I want to get the Optimizable a constraint removed from the type signature. It's not needed when the History monad is evaluated using the evalHistory function. Last time I thought about this was when using ghc 7.8, so there might be some new features that would let me do this, but I haven't looked into it in a while.
from hlearn.
My use case for History is pretty much yours, lol. See https://github.com/tonyday567/digit-recognizer/blob/master/testing/BenchKnn.hs#L208. I'd like to replace all the time and timeIOs with report and reportIOs (or similar), and include the info wrapped up in Criterion.Extended.
wrt the ugly side, that sig is due to infoType. You could give up on automatically using the type as the report collection label and accept a hard-coded label (of type s), eg
report :: s -> a -> History_ s a
or you could go the whole hog, and use a String as the label as beginFunction does, making it:
newtype History a = History (ReaderT DisplayFunction (StateT (String, [Report]) IO) a)
report :: String -> a -> History a
or even
newtype (MonadIO m) => HistoryT m a = HistoryT (ReaderT DisplayFunction (StateT (String, [Report]) m) a)
from hlearn.
I went back and looked at the History monad a bit more today. I managed to refactor out the Optimizable constraint from all the type signatures. I'm quite a bit happier with how it looks now. I've just uploaded the changes to GitHub: https://github.com/mikeizbicki/HLearn/blob/ghc7.10/src/HLearn/Optimization/Univariate.hs
There are still a few odd constraints on some of the functions. I think I can remove them too, but I'm off to bed now :)
from hlearn.
Looking at how to integrate History.Timing with History, I ran into a brick wall.
The problem that I'm trying to solve is that the Report/Measure thing is related to the context used in the step. So Report as written matches CountInfo as written (and also goes with the hardcoded getCPUTime in runHistory for example).
My brain then finds it difficult to abstract these relationships for the stepDisplayFunction sig, forall a. cxt a => Report -> cxt -> a -> (cxt, IO ()), given the commonalities between Report and cxt.
The ideal would be for Report/Context to be built up using disparate effects. The components look like pre-computation data gathering (eg getCPUTime), post-computation compute, a running total and display of the data. I came up with this data type:
data Measure = forall a b. (Monoid a, Monoid b) => Measure
    { measure :: b
    , prestep :: IO a
    , poststep :: a -> b -> IO b
    , display :: b -> String
    }
An experiment using this is here: https://github.com/tonyday567/HLearn/blob/ghc7.10dev/src/HLearn/History/Measure.hs
But that turned out to be a dead end: it's hard to use because the existential quantification swallows the types a and b.
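One way around the swallowed types, sketched here as an assumption rather than anything that exists in HLearn, is to expose a and b as type parameters instead of hiding them behind forall. The runStep helper and the cpuTime and stepCount values are hypothetical names, and the Monoid constraints are dropped because this sketch never combines measures.

```haskell
import System.CPUTime (getCPUTime)

-- Same fields as the existential version, but a and b stay visible
-- in the type, so the record selectors remain ordinary functions.
data Measure a b = Measure
    { measure  :: b                 -- running total so far
    , prestep  :: IO a              -- data gathered before the step
    , poststep :: a -> b -> IO b    -- fold the step into the running total
    , display  :: b -> String       -- render the running total
    }

-- Run one action under a measure and return the updated measure.
-- Note: the result r is not forced here, so a lazy result is not timed.
runStep :: Measure a b -> IO r -> IO (r, Measure a b)
runStep m action = do
    a <- prestep m
    r <- action
    b <- poststep m a (measure m)
    return (r, m { measure = b })

-- Cumulative CPU time in picoseconds, as reported by getCPUTime.
cpuTime :: Measure Integer Integer
cpuTime = Measure
    { measure  = 0
    , prestep  = getCPUTime
    , poststep = \start total -> do
        end <- getCPUTime
        return (total + (end - start))
    , display  = \total -> "cpu time (ps): " ++ show total
    }

-- A deterministic measure that just counts steps.
stepCount :: Measure () Int
stepCount = Measure
    { measure  = 0
    , prestep  = return ()
    , poststep = \() n -> return (n + 1)
    , display  = \n -> "steps: " ++ show n
    }
```

The trade-off is that a list of differently typed measures no longer type-checks without wrapping them again, which is presumably what the existential was meant to allow in the first place.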
from hlearn.
This change shouldn't require adding a new Measure type or really interacting with the Report type at all.
We can create two new functions that are analogues of the time and timeIO functions like:
withMsg :: (cxt String, NFData a) => String -> a -> History_ cxt s a
withMsg msg a = withMsgIO msg (return a)

withMsgIO :: (cxt String, NFData a) => String -> IO a -> History_ cxt s a
withMsgIO msg ioa = do
    a <- History $ liftIO ioa
    report $ deepseq a $ msg
    return a
I haven't actually tested these functions, but I think they should work.
Then the next step is to write a DisplayFunction_ that processes the History monad in a way that recreates the output that timeIO did. I think this should only require writing the stepDisplayFunction in a way that doesn't use the s state variable. In particular, the specialized type signature would read:
stepDisplayFunction :: forall a. cxt a => Report -> () -> a -> ((), IO ())
Then in order to get the message printed to the screen, stepDisplayFunction will run show on the input variable of type a, requiring cxt ~ Show.
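A rough, untested-against-HLearn sketch of that specialization, with cxt fixed to Show and the state fixed to (). The Report type here is a made-up stand-in that only keeps a nesting level, and renderStep is a hypothetical helper that separates the formatting from the IO.

```haskell
-- Hypothetical stand-in for HLearn's Report; only the nesting level is kept.
newtype Report = Report { reportLevel :: Int }

-- Pure part: indent by the report level and show the reported value.
renderStep :: Show a => Report -> a -> String
renderStep r a = replicate (2 * reportLevel r) ' ' ++ show a

-- The signature above with cxt ~ Show. The state is the unit value,
-- so it is passed through untouched and only the IO action matters.
stepDisplayFunction :: Show a => Report -> () -> a -> ((), IO ())
stepDisplayFunction r () a = ((), putStrLn (renderStep r a))
```

One thing to watch for: show on a String adds quotes, so a message reported as "loading" would print as "\"loading\"" unless the display function special-cases strings.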
Does that explanation make sense?
from hlearn.
I think so - will give it a try! It won't give you real time information though.
The part I left out was that I'd like (personally) to collect GC information that you can get from GHC.Stats, so I wanted to write:
import GHC.Stats
data Report = Report
    { cpuTimeStart :: !CPUTime
    , cpuTimeDiff :: !CPUTime
    , gcStatsStart :: !GCStats
    , gcStatsDiff :: !GCStats
    , numReports :: {-# UNPACK #-} !Int
    , reportLevel :: {-# UNPACK #-} !Int
    }
    deriving Show
but that would then involve a rewrite of runHistory and report. So I was thinking of how to generalise (adding cache misses etc.) without having to rewrite everything each time.
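As a sketch of the start/diff idea outside of HLearn: the Snapshot type and the takeSnapshot, diffSnapshot and measured functions below are hypothetical names, not part of the History code. It also assumes a recent GHC, where GHC.Stats exposes RTSStats via getRTSStats in place of the older GCStats, and where the statistics are only available when the program runs with +RTS -T.

```haskell
import Data.Word (Word32, Word64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.CPUTime (getCPUTime)

-- Hypothetical snapshot of the counters we care about.
data Snapshot = Snapshot
    { snapCpuTime   :: !Integer          -- picoseconds, from getCPUTime
    , snapAllocated :: !(Maybe Word64)   -- bytes allocated, if RTS stats are on
    , snapGcs       :: !(Maybe Word32)   -- number of GCs, if RTS stats are on
    } deriving Show

-- Read the counters; GC fields are Nothing unless run with +RTS -T.
takeSnapshot :: IO Snapshot
takeSnapshot = do
    cpu <- getCPUTime
    enabled <- getRTSStatsEnabled
    if enabled
        then do
            s <- getRTSStats
            return (Snapshot cpu (Just (allocated_bytes s)) (Just (gcs s)))
        else return (Snapshot cpu Nothing Nothing)

-- Field-wise difference between an end and a start snapshot.
diffSnapshot :: Snapshot -> Snapshot -> Snapshot
diffSnapshot start end = Snapshot
    { snapCpuTime   = snapCpuTime end - snapCpuTime start
    , snapAllocated = (-) <$> snapAllocated end <*> snapAllocated start
    , snapGcs       = (-) <$> snapGcs end <*> snapGcs start
    }

-- Run an action and return its result with the counter differences.
measured :: IO a -> IO (a, Snapshot)
measured action = do
    start <- takeSnapshot
    a <- action
    end <- takeSnapshot
    return (a, diffSnapshot start end)
```

Keeping the snapshot and the diff as one type means a new counter is a new field plus one line in each function, rather than a change to runHistory and report themselves.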
from hlearn.
Ahh... I see. That's actually a really interesting idea... I'll have to start thinking about that too.
from hlearn.
I did some testing of rdtsc and it has outstanding metrics, compared with getCPUTime and getCurrentTime. It has to be the future of high performance regressions.
https://github.com/tonyday567/perf
from hlearn.
That looks super nice!
After a quick look through, this might be using the same API I was hoping to use to measure cache performance.
from hlearn.
I don't think you're going to be that lucky, sorry.
You're looking for the rdpmc instruction. It's a lot more complicated and I haven't seen a clean *.c for this - it needs drivers and all of that guff. The Linux perf tool is only a command-line thing, and the Haskell wrappers out there read the data file that this command creates.
from hlearn.