92 Commits

Author SHA1 Message Date
59b1bf5cda Add IWRAM section attribute macro to mimic -ffunction-sections
See the comment added in this commit for details. Currently unused.
Would save some IWRAM usage in the matrix implementation at the cost of
readability. If one operation on a matrix size is used, most other
operations will likely be used too, so in practice this may not change
IWRAM usage much. Only including the matrix sizes that are used in the
final binary would likely have a greater impact. FURTHER TESTING
REQUIRED.
2024-10-08 00:00:22 -07:00
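A minimal sketch of what such a macro could look like, assuming GCC on ARM. The names IWRAM_SECTION, IWRAM_STR and IWRAM_STR2 are hypothetical and not taken from the repository. Each function gets its own ".iwram.<name>" input section, so a linker run with --gc-sections and a linker script that collects ".iwram.*" could discard the ones that are never referenced. On other targets the macro expands to nothing.

```cpp
#include <cassert>

// Two-step stringification so the macro argument becomes a string literal.
#define IWRAM_STR2(x) #x
#define IWRAM_STR(x) IWRAM_STR2(x)

#if defined(__arm__) && defined(__GNUC__)
// Hypothetical macro: one input section per function, similar in effect to
// what -ffunction-sections does for the default .text section.
#define IWRAM_SECTION(name) \
    __attribute__((section(".iwram." IWRAM_STR(name))))
#else
// No-op when not building for ARM with GCC.
#define IWRAM_SECTION(name)
#endif

// Example function that would land in section ".iwram.mat2_determinant".
IWRAM_SECTION(mat2_determinant)
int mat2_determinant(int a, int b, int c, int d) {
    return a * d - b * c;
}
```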
64ae33535b Remove tests from IWRAM
While some parts of the tests should be contained in IWRAM for profiling
purposes, the majority of test code should not.
2024-10-07 23:59:18 -07:00
a449c4f749 Add conditional test compilation and split matrix instantiations
This commit adds the ability to conditionally compile tests. To
implement this, common source files needed to be moved to a subdirectory
`common`.

Additionally, this commit splits each `mat` template instantiation into
a separate source file. This enables the linker to discard unused
instantiations. If each instantiation is placed in the same file and also
into the .iwram section, the linker will include every symbol in the
resulting binary, leading to an extremely high IWRAM usage. To introduce
this split, a new header file "mat_impl.hpp" was added containing
implementation details for the template instantiations. "mat_impl.hpp"
should ONLY be included by the source files containing explicit
instantiations of `mat`.
2024-10-07 23:49:01 -07:00
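The split described above could be laid out roughly as follows. This is a single-file sketch with comments marking the hypothetical file boundaries; the `mat` members and the file names other than "mat_impl.hpp" are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// --- mat.hpp (hypothetical): declarations only, included by all users ---
template <std::size_t R, std::size_t C>
struct mat {
    int e[R][C];
    int trace() const;  // defined in mat_impl.hpp
};
// Explicit instantiation declarations: users do not instantiate these
// themselves and instead link against the per-size object files.
extern template struct mat<2, 2>;
extern template struct mat<3, 3>;

// --- mat_impl.hpp: definitions, included ONLY by the files below ---
template <std::size_t R, std::size_t C>
int mat<R, C>::trace() const {
    int sum = 0;
    for (std::size_t i = 0; i < R && i < C; ++i) {
        sum += e[i][i];
    }
    return sum;
}

// --- mat2x2.cpp (hypothetical): one explicit instantiation per file ---
template struct mat<2, 2>;

// --- mat3x3.cpp (hypothetical) ---
template struct mat<3, 3>;
```

With one instantiation per object file, a program that only uses `mat<2, 2>` never pulls the `mat<3, 3>` object out of the static library.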
094a4731c5 Modify fixed-point numbers to use 8 bits for the decimal point
This was done because with 6 bits of precision, error would accumulate
up to 0.078 when computing a projection matrix. Changing the fractional
precision to 8 bits minimizes the effect of this error, reducing it
closer to 0.016. This does decrease the maximum value from around
33,000,000 to around 8,000,000, but this shouldn't be an issue.
2024-09-19 19:30:25 -07:00
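The range figures above can be reproduced with a short calculation, assuming the fixed-point type is stored in a signed 32-bit integer (an assumption; the log does not state the storage type). The helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Smallest representable step for a given number of fractional bits.
constexpr double resolution(int frac_bits) {
    return 1.0 / (1 << frac_bits);
}

// Largest whole number that fits when the value is stored in an int32_t.
constexpr std::int32_t max_whole(int frac_bits) {
    return INT32_MAX >> frac_bits;
}

// 6 fractional bits: step of 1/64, whole part up to about 33.5 million.
static_assert(max_whole(6) == 33554431, "26 bits left for sign and integer");
// 8 fractional bits: step of 1/256, whole part up to about 8.4 million.
static_assert(max_whole(8) == 8388607, "24 bits left for sign and integer");
```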
8f30ed4311 Add vector projection test 2024-09-19 18:10:05 -07:00
982752404e Implement vec::transpose, vec::operator==, vec::operator!= and vec
string output
2024-09-19 18:08:51 -07:00
a72caf6b46 Add matrix and tests 2024-09-19 18:07:16 -07:00
41ea3b2ee5 Force loop unrolling in vec
We can't push/pop optimize options because they don't apply for inlined
functions. Function attributes also won't apply for inlined functions.
Because most (if not all) vector operations are inlined, neither of
these are appropriate options. However, GCC 8.1 introduces a new pragma,
unroll, that allows us to unroll specific loops. This pragma does apply
for inlined functions.
2024-09-19 14:43:17 -07:00
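A minimal sketch of the pragma in use, assuming GCC 8.1 or newer. The `vec4` type and `dot` function are hypothetical stand-ins for the project's vector class. Compilers that do not know the pragma normally ignore it with a warning.

```cpp
#include <cassert>

// Hypothetical stand-in for the project's vector type.
struct vec4 {
    int e[4];
};

// The pragma applies to the loop that immediately follows it, so it still
// takes effect after this function is inlined into its caller.
inline int dot(const vec4& a, const vec4& b) {
    int sum = 0;
#pragma GCC unroll 4
    for (int i = 0; i < 4; ++i) {
        sum += a.e[i] * b.e[i];
    }
    return sum;
}
```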
810750febb Add vector unit tests and timing
Adds magnitude_sqr tests
2024-09-17 09:05:31 -07:00
91b4f47d9d Add timing to fixed unit tests 2024-09-17 09:04:42 -07:00
23fc152e6f Add ability to start/stop test timer at specific points
This timer is expected to be global, so this would make parallelizing
tests difficult. However, because the main target of this project (the
GBA) does not support parallelization, this is not a concern at the
moment.
2024-09-12 15:18:15 -07:00
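A rough sketch of a global start/stop timer of this kind. It uses std::chrono as a host-side stand-in for the GBA hardware timers, and the names `test_timer` and `g_timer` are hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Accumulates only the time spent between start() and stop() calls, so
// setup and assertion code can be excluded from a measurement.
struct test_timer {
    using clock = std::chrono::steady_clock;

    clock::duration total{};
    clock::time_point started{};
    bool running = false;

    void start() {
        if (!running) {
            started = clock::now();
            running = true;
        }
    }

    void stop() {
        if (running) {
            total += clock::now() - started;
            running = false;
        }
    }

    void reset() {
        total = clock::duration{};
        running = false;
    }
};

// A single shared instance: simple, but not safe for parallel tests.
static test_timer g_timer;
```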
7eef3819ad Fix incorrect size of GBA timer registers in test code 2024-09-12 15:16:34 -07:00
64f44984b6 Rewrite math vector implementation and add tests
This new implementation consolidates vec2/3/4 into one base class and
improves performance.
2024-08-30 15:29:51 -07:00
b7decb3923 Add fixed point multiplication tests 2024-08-29 22:07:08 -07:00
89424a914e Rewrite test framework
This new framework does not automatically register test suites, but it
is much simpler to use. I might revisit the old approach later, but for
now this works, KISS.
2024-08-18 23:55:17 -07:00
c1fa39cd8c Add string_view equal/not equal comparison operators 2024-08-18 23:51:22 -07:00
79b3990339 Fix initializer list reordering warnings
The warnings occurred because member initializer lists must be in the
same order as the member declarations.
2024-08-18 23:50:14 -07:00
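For illustration, a small sketch of the rule behind this warning (GCC's -Wreorder). The `text_span` type is hypothetical. Members are always initialized in declaration order, whatever order the initializer list is written in, so keeping both orders identical avoids surprises.

```cpp
#include <cassert>
#include <cstddef>

struct text_span {
    const char* data_;   // declared first, so initialized first
    std::size_t size_;   // declared second, so initialized second

    // The initializer list matches the declaration order. Writing
    // ": size_(n), data_(s)" would still initialize data_ first and
    // would trigger the reordering warning.
    text_span(const char* s, std::size_t n) : data_(s), size_(n) {}
};
```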
5742885021 Fix incorrect return type on fixed point comparison operators 2024-08-18 23:49:01 -07:00
543218e489 Add fixed point comparison operators 2024-08-18 14:10:44 -07:00
0711ec1fc1 Add test running 2024-08-18 14:04:09 -07:00
589676854c Replace TARGET_ARM_MODE pragma with ARM_MODE function attributes 2024-08-03 17:28:27 -06:00
eb4412847e Add noexcept specifier to fixed point operations 2024-08-03 16:18:20 -06:00
b3a6dd48d0 Fix incorrect attribute placement on fixed::operator/
Function attributes should be placed in the function declaration, not
definition.
2024-08-03 16:15:17 -06:00
5e4c492894 Replace fixed point multiplication with C++ implementation
Fixed point multiplication used an ARM inline assembly routine. This was
fast, but unfortunately, caused some odd attempted inlining problems
when used from Thumb-mode code. This commit replaces this assembly
routine with a C++ implementation that performs as well as or better than the
assembly routine in most cases. The C++ implementation is slightly
slower when called from Thumb-mode code because GCC inlines the
operation instead of calling a standalone ARM-mode routine placed in
IWRAM. The performance tradeoff is acceptable though because of the
fixes, portability, and ARM-mode performance improvements it provides.
2024-08-03 16:08:22 -06:00
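A portable fixed-point multiply is commonly written with a 64-bit intermediate, roughly as sketched below. This is not the repository's implementation. The `fixed` layout, the helper names, and the choice of 8 fractional bits are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr int FRAC_BITS = 8;  // assumption for this sketch

// Hypothetical minimal fixed-point value: raw = value * 2^FRAC_BITS.
struct fixed {
    std::int32_t raw;
};

constexpr fixed from_int(int v) {
    return fixed{static_cast<std::int32_t>(v * (1 << FRAC_BITS))};
}

// Widen to 64 bits so the product cannot overflow, then drop the extra
// FRAC_BITS. Right-shifting a negative value is an arithmetic shift on GCC
// (and is guaranteed to be one from C++20 onwards).
constexpr fixed mul(fixed a, fixed b) {
    return fixed{static_cast<std::int32_t>(
        (static_cast<std::int64_t>(a.raw) * b.raw) >> FRAC_BITS)};
}
```

Because this is plain C++, the compiler is free to inline it into either ARM-mode or Thumb-mode callers.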
cc7c346f84 Add fixed point assignment operators 2024-08-02 23:27:13 -06:00
d0557bad3e Fix incorrect fixed point division result
Division was not returning the result as raw.
2024-08-02 23:26:24 -06:00
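One plausible reading of this fix, sketched under assumptions: the quotient is already in raw fixed-point form, so it must be stored directly rather than passed through an integer constructor that scales it again. The `fixed` type, `raw_tag`, and `div` below are hypothetical, and division by zero is not handled.

```cpp
#include <cassert>
#include <cstdint>

constexpr int FRAC_BITS = 8;  // assumption for this sketch

struct raw_tag {};

struct fixed {
    std::int32_t raw;

    // Converting constructor: scales a whole number into raw form.
    constexpr fixed(int whole) : raw(whole * (1 << FRAC_BITS)) {}
    // Raw constructor: stores an already-scaled value unchanged.
    constexpr fixed(std::int32_t r, raw_tag) : raw(r) {}
};

// Pre-scale the dividend so the quotient keeps FRAC_BITS fractional bits,
// then construct the result "as raw" to avoid scaling it a second time.
constexpr fixed div(fixed a, fixed b) {
    return fixed(
        static_cast<std::int32_t>(
            (static_cast<std::int64_t>(a.raw) * (1 << FRAC_BITS)) / b.raw),
        raw_tag{});
}
```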
8d23f2cf09 Fix fixed point number compilation failures during attempted inlining
Currently, some fixed point operations (notably multiplication) fail to
compile when used in Thumb-mode routines. This occurs because GCC
attempts to inline the operation into the Thumb-mode routine, but the
operation uses ARM-mode only instructions. This commit adds the ".arm"
directive into the inline assembly of the implementation, which informs
GCC that the assembly uses ARM-mode instructions and prevents inlining.
As a result, fixed point numbers can be used from both ARM-mode and
Thumb-mode code without issues! Usage in ARM-mode should still be
preferred for optimal performance though.
2024-08-02 22:10:39 -06:00
5fced73f46 Add fixed point number unary negation operator 2024-08-02 22:07:04 -06:00
233512f5b4 Add initial vector4 implementation 2024-07-30 11:49:07 -06:00
2181557d9d Remove conditional fixed point number inlining
Caused issues with ODR violations. Now fixed point numbers should
only be used in ARM-mode. Attempting to use them in Thumb-mode will
cause a compilation failure. This commit also moves operator/ into IWRAM
on the GBA.
2024-07-30 11:45:08 -06:00
62da9d03c1 Add GBA section target macros 2024-07-30 11:43:21 -06:00
64763adf47 Add fixed point subtraction 2024-07-29 22:47:37 -06:00
9561686585 Optimize and expand fixed point number implementation
Before this commit, fixed point multiplication was implemented using an
assembly routine in a separate translation unit. This commit implements
this routine directly using inline assembly. By doing so, these
operations can be inlined when called from ARM code. Fixed point
division is implemented as well, along with various documentation and
style improvements.
2024-07-28 19:24:47 -06:00
f2862f7c96 Add log::stream_type typedef
This addition is useful when an explicit template instantiation of
operator<< is needed, for example, when logging from an ARM mode
function. Example usage:

template mtl::log::stream_type& mtl::log::stream_type::operator<<
<uint16_t>(uint16_t);

ARM_MODE void foo(uint16_t x) {
	// Without the explicit template instantiation, an ODR violation
	// would occur.
	mtl::log::debug << x;
}
2024-07-28 18:08:21 -06:00
45a1690032 Remove implicit inline attribute from ALWAYS_INLINE target option
May cause issues when multiple non-standard attributes are used.
Standard attributes must come before all non-standard attributes.
2024-07-27 18:17:42 -06:00
4778b477ec Add architecture specific target option macros
These macros are defined in target.hpp. This commit adds the macros:

NOINLINE - Never inline the function
ALWAYS_INLINE - Force the function to be inlined (also adds the inline
attribute to the function)

TARGET_ARM_MODE - Compile all future functions in ARM mode until
TARGET_END_MODE is reached. No-op if not compiling for ARM.
TARGET_THUMB_MODE - Compile all future functions in thumb mode until
TARGET_END_MODE is reached. No-op if not compiling for ARM.
TARGET_END_MODE - Undo the last TARGET_*_MODE option.

ARM_MODE - Compile this function in ARM mode.
THUMB_MODE - Compile this function in thumb mode.
2024-07-27 17:52:39 -06:00
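The per-function macros could be defined along these lines. This is a sketch assuming GCC, whose ARM port accepts target("arm") and target("thumb") function attributes; the actual contents of target.hpp are not shown in this log, and the pragma-based TARGET_*_MODE macros are left out.

```cpp
#include <cassert>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
// As described above, this version also supplies the inline keyword.
#define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define NOINLINE
#define ALWAYS_INLINE inline
#endif

#if defined(__arm__) && defined(__GNUC__)
#define ARM_MODE __attribute__((target("arm")))
#define THUMB_MODE __attribute__((target("thumb")))
#else
// No-ops when not compiling for ARM.
#define ARM_MODE
#define THUMB_MODE
#endif

// Hypothetical functions showing where each macro goes.
NOINLINE int add_one(int x) { return x + 1; }
ALWAYS_INLINE int twice(int x) { return 2 * x; }
ARM_MODE int fast_square(int x) { return x * x; }
THUMB_MODE int small_negate(int x) { return -x; }
```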
1af4b652b8 Add initial queue implementation 2024-07-26 11:03:08 -06:00
c48e03dc26 Add initial vector implementation 2024-07-25 22:55:47 -06:00
3bba697698 Completely rewrite FSM implementation
Previously, each FSM class could only have one instance because of the
use of globals. This new implementation only uses memory allocated on
the stack, so multiple instances can be created at once. Dynamic
allocation is still unused. Additionally, this approach uses a more
logical separation between the FSM, states, and events.
2024-07-21 21:55:13 -06:00
03eaef3f48 Add default constructor to string_view 2024-06-22 16:43:45 -06:00
f87111abbf Add ability to disable logging
If MTL_LOGGING_DISABLED is defined, logging is disabled. Otherwise, it
is enabled.
2024-06-19 19:36:26 -06:00
68dd09d561 Add additional documentation for log 2024-06-19 19:32:46 -06:00
f354b2d733 Change log level to use an enum instead of uint 2024-06-19 19:29:45 -06:00
4dd979ef54 Change basic_string_stream to not clear the buffer on flush
If the buffer is cleared when flushed, the class does not function
correctly as a string builder. For example, if a string is built with a
newline inside, everything before the newline will be cleared and the
string will be incomplete. Clearing the buffer on flush only makes sense
for applications such as logging or writing to a file.
2024-04-10 08:46:21 -06:00
82f60f4767 Refactor string_stream and string_streamx, and provide ENDL selection
option

This commit refactors string_stream and string_streamx into a common
basic_string_stream template class. When the EXT template parameter is
true, the string_streamx formatting options are enabled; they default to
disabled.
These options are enabled at compile time, and do not affect performance
when they are disabled. By implementing the two streams in this manner,
duplicated code is removed.

This commit also adds the ENDL template parameter. When ENDL is set to
zero, no endline character is printed when piping mtl::endl. Otherwise,
the character is printed. Defaults to '\n'. This allows logging on the
GBA to handle mtl::endl correctly and not print two newlines on the MGBA
emulator.
2024-04-10 08:39:30 -06:00
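The ENDL behaviour could be sketched as below. This is an illustration only: it lives in a hypothetical `sketch` namespace, uses std::string for storage, and omits the EXT parameter, none of which is confirmed by the log.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Tag object standing in for mtl::endl.
struct endl_t {};
constexpr endl_t endl{};

// ENDL selects the end-of-line character at compile time; '\0' means
// "emit nothing", which suits back ends that add their own newline.
template <char ENDL = '\n'>
class basic_string_stream {
public:
    basic_string_stream& operator<<(const char* s) {
        buf_ += s;
        return *this;
    }

    basic_string_stream& operator<<(endl_t) {
        if (ENDL != '\0') {
            buf_ += ENDL;
        }
        return *this;
    }

    const std::string& str() const { return buf_; }

private:
    std::string buf_;
};

}  // namespace sketch
```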
554c88f7f2 Fix udiv100000 and udiv1000000000 ASM macros changing the source
register
2024-04-04 04:18:36 -06:00
813a499734 Change architecture dependent compilation to not use janky hacks
Previously, a source file was not added to the target if a source of the
same name already existed. This was janky because these files would be
considered the same: src/foo.cpp src/armv4t/bar/baz/foo.cpp, even though
they really shouldn't be. What should happen instead is that the symbols
of the architecture-specific code are not overridden by the common
implementation regardless of where the file is placed. This means that
if the files src/foo.cpp and src/armv4t/bar.cpp contain implementations
of the function foo, the armv4t implementation will be exported
even though it uses a different filename from the common
implementation. This commit implements this behaviour by relying on the
way the linker naturally resolves symbols. A separate smaller library is
built for each set of architecture-dependent code. Afterwards, the
libraries are linked into one, with the arch-specific libraries linked
first.
2024-04-04 04:11:07 -06:00
d1155befdb Fix inaccurate implementation of udiv100000 and udiv1000000000
Both assembly macros failed when given large numbers ending in 9. For
example, udiv100000 of 3999999999 produced 40000 instead of 39999.
Similarly, udiv1000000000 of 3999999999 produced 4 instead of 3.

Both of the previous implementations failed the Granlund-Montgomery
integer division algorithm. This commit replaces these macros with the
correct implementation generated by clang for a constant integer
division. I do not understand how this implementation works. All other
macros do pass the Granlund-Montgomery algorithm.
2024-04-01 19:45:51 -06:00
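For readers wondering how such compiler-generated code works, here is a hedged C++ sketch of division by 100000 using multiplication by a precomputed reciprocal. It is not the repository's assembly macro. The constant is my own derivation: 100000 = 32 * 3125, and 175921861 = ceil(2^39 / 3125).

```cpp
#include <cassert>
#include <cstdint>

// x / 100000 for any 32-bit unsigned x, without a divide instruction.
//
// Step 1: x >> 5 divides by 32 exactly (floor), leaving a value n < 2^27.
// Step 2: n / 3125 is computed as (n * M) >> 39 with M = ceil(2^39 / 3125).
//         M * 3125 exceeds 2^39 by only 1737, and n * 1737 < 2^39 for all
//         n < 2^27, so the rounding error never reaches the next integer.
constexpr std::uint32_t udiv100000(std::uint32_t x) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(x >> 5) * 175921861u) >> 39);
}
```

The failing input mentioned above, 3999999999, sits just below a multiple of 100000, which is where a reciprocal that is rounded or shifted incorrectly tends to tip over to the next quotient.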
1b48b5ec80 Change common to_string implementations to use snprintf instead of itoa 2024-03-27 00:59:39 -06:00
4e56cb5269 Fix typo: stdin instead of stdout 2024-03-27 00:59:10 -06:00